Steal the day and cheat your JVM with JavaAgent

Anyone familiar with Spring will know AOP well: aspect-oriented programming lets us weave extra logic before and after a target method. The Java Agent technology covered here is similar to AOP in spirit. It is also sometimes called Java probe technology.

Java Agent was introduced in JDK 1.5. It lets programmers build an agent program that is independent of the application. It has a wide range of uses: it can monitor, operate on, and even replace parts of programs running on other JVMs. First, take a look at the scenarios in which it is applied in the following figure:

Seeing this, are you curious about what kind of technology can be applied in so many scenarios? Today, let's explore how Java Agent works under the hood and quietly supports so many excellent applications.

Returning to the analogy at the beginning of the article, let's first have a general understanding of Java Agent by comparing it with aop:

  • Scope: AOP runs at the method level within the application, while an agent can act at the virtual machine level

  • Components: AOP requires a target method and the enhancement logic, while Java Agent needs two projects to take effect: the agent itself and the main program being instrumented

  • Timing: AOP advice can run before, after, or around the pointcut, while a Java Agent has only two ways to execute. The premain mode provided by JDK 1.5 runs before the main program starts, and the agentmain mode provided by JDK 1.6 runs after the main program has started

Let's take a look at how to implement an agent program in two modes.

Premain mode

Premain mode lets an agent run before the main program executes. It is simple to implement. Next, we implement the two components in turn.

agent

First write a simple function, print a sentence before the main program is executed, and print the parameters passed to the agent:

import java.lang.instrument.Instrumentation;

public class MyPreMainAgent {
    public static void premain(String agentArgs, Instrumentation inst) {
        System.out.println("premain start");
        System.out.println("args:"+agentArgs);
    }
}

After writing the logic of the agent, we need to package it into a jar file. Here, we directly use the maven plug-in to package and configure it before packaging.

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-jar-plugin</artifactId>
            <version>3.1.0</version>
            <configuration>
                <archive>
                    <manifest>
                        <addClasspath>true</addClasspath>
                    </manifest>
                    <manifestEntries>
                        <Premain-Class>com.cn.agent.MyPreMainAgent</Premain-Class>                            
                        <Can-Redefine-Classes>true</Can-Redefine-Classes>
                        <Can-Retransform-Classes>true</Can-Retransform-Classes>
                        <Can-Set-Native-Method-Prefix>true</Can-Set-Native-Method-Prefix>
                    </manifestEntries>
                </archive>
            </configuration>
        </plugin>
    </plugins>
</build>

In the packaging configuration, manifestEntries adds attributes to the MANIFEST.MF file. The attributes are explained below:

  • Premain-Class: the class containing the premain method, given as its fully qualified name

  • Can-Redefine-Classes: when true, the agent is allowed to redefine classes

  • Can-Retransform-Classes: when true, the agent is allowed to retransform classes, which enables bytecode replacement of already loaded classes

  • Can-Set-Native-Method-Prefix: when true, the agent is allowed to set a native method prefix

Premain-Class is mandatory for premain mode; the other entries are optional and default to false. It is generally recommended to enable these capabilities, which are described in detail later. After the configuration is complete, package with the mvn command:

mvn clean package

After packaging, the myAgent-1.0.jar file is generated. We can decompress the jar file and take a look at the generated MANIFEST.MF file:

You can see that the configured attributes have been written to the file. The agent part is now complete. Because the agent cannot run on its own and needs to be attached to another program, we create a new project to implement the main program.
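To make the contents of the manifest concrete, the sketch below builds the same main attributes with java.util.jar.Manifest and prints them. It only illustrates the entries configured above; the real file generated by Maven contains additional entries, and the class name ManifestSketch is a hypothetical name chosen for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestSketch {
    /** Renders a manifest holding the agent attributes from the Maven configuration. */
    static String render() {
        Manifest mf = new Manifest();
        Attributes main = mf.getMainAttributes();
        // Manifest-Version must be present, otherwise nothing is written
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Premain-Class", "com.cn.agent.MyPreMainAgent");
        main.putValue("Can-Redefine-Classes", "true");
        main.putValue("Can-Retransform-Classes", "true");
        main.putValue("Can-Set-Native-Method-Prefix", "true");
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            mf.write(out);
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The order of the entries after Manifest-Version may differ between JDK versions
        System.out.print(render());
    }
}
```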

main program

In the project of the main program, you only need an entry to the main method that can be executed.

public class AgentTest {
    public static void main(String[] args) {
        System.out.println("main project start");
    }
}

Once the main program is complete, we need to connect it with the agent project. The agent to run is specified through the -javaagent parameter. The command format is as follows:

java -javaagent:myAgent.jar -jar AgentTest.jar

In addition, there is no limit on the number of agents that can be specified. The agents are executed one after another in the order given. To run two agents, use a command like the following:

java -javaagent:myAgent1.jar -javaagent:myAgent2.jar  -jar AgentTest.jar

Taking a program run from IntelliJ IDEA as an example, add the startup parameters to VM options:

-javaagent:F:\Workspace\MyAgent\target\myAgent-1.0.jar=Hydra
-javaagent:F:\Workspace\MyAgent\target\myAgent-1.0.jar=Trunks

Execute the main method to view the output results:

From the printed output, we can see that the agent is executed twice before the main program runs. The execution order of the agents and the main program can be represented by the following figure.
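Everything after the = sign in -javaagent reaches premain as one agentArgs string; any further structure is up to the agent. As a minimal sketch, assuming a hypothetical "key=value,key=value" convention (not something the JVM defines), the options could be parsed like this. The class name AgentArgs is likewise made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AgentArgs {
    /** Parses "k1=v1,k2=v2"; a bare token such as "Hydra" becomes a key with an empty value. */
    static Map<String, String> parse(String agentArgs) {
        Map<String, String> options = new LinkedHashMap<>();
        // agentArgs may be null when -javaagent is given without "=..."
        if (agentArgs == null || agentArgs.trim().isEmpty()) {
            return options;
        }
        for (String pair : agentArgs.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                options.put(pair.trim(), "");
            } else {
                options.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return options;
    }

    public static void main(String[] args) {
        System.out.println(parse("name=Hydra,verbose=true")); // {name=Hydra, verbose=true}
    }
}
```

A premain method could call AgentArgs.parse(agentArgs) as its first step and then read individual options from the map.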

defect

While convenient, premain mode also has some defects. For example, if the agent throws an exception while running, the main program fails to start as well. We modify the agent code from the example above to throw an exception manually.

public static void premain(String agentArgs, Instrumentation inst) {
    System.out.println("premain start");
    System.out.println("args:"+agentArgs);
    throw new RuntimeException("error");
}

Run the main program again:

It can be seen that the main program does not start after the agent throws an exception. To address some of these defects of premain mode, agentmain mode was introduced in JDK 1.6.
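Independently of agentmain, an agent can soften this particular defect itself by not letting exceptions escape from premain. The following is a minimal sketch, assuming the agent is optional and the application should start without it; the class name, the init method and the degraded flag are hypothetical names used only for illustration.

```java
import java.lang.instrument.Instrumentation;

public class SafePreMainAgent {
    /** Set when initialisation failed and the application runs without the agent. */
    static volatile boolean degraded = false;

    public static void premain(String agentArgs, Instrumentation inst) {
        try {
            init(agentArgs, inst);
        } catch (Throwable t) {
            // Swallow the failure so the JVM still goes on to run the main program
            degraded = true;
            System.err.println("agent failed to initialise, continuing without it: " + t);
        }
    }

    // Stand-in for the real agent logic; here it fails the same way as the example above
    static void init(String agentArgs, Instrumentation inst) {
        System.out.println("premain start");
        System.out.println("args:" + agentArgs);
        throw new RuntimeException("error");
    }
}
```

Whether swallowing the error is acceptable depends on the agent: for monitoring it is usually fine, while an agent the application relies on should probably still fail fast.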

Agentmain mode

agentmain mode can be seen as an upgraded version of premain. It lets the JVM of the target main program start first, and then connects the two JVMs through the attach mechanism. We implement it in three parts below.

agent

The agent part is the same as above and implements a simple printing function:

import java.lang.instrument.Instrumentation;

public class MyAgentMain {
    public static void agentmain(String agentArgs, Instrumentation instrumentation) {
        System.out.println("agent main start");
        System.out.println("args:"+agentArgs);
    }
}

Modify the maven plug-in configuration and specify the agent class:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-jar-plugin</artifactId>
    <version>3.1.0</version>
    <configuration>
        <archive>
            <manifest>
                <addClasspath>true</addClasspath>
            </manifest>
            <manifestEntries>
                <Agent-Class>com.cn.agent.MyAgentMain</Agent-Class>
                <Can-Redefine-Classes>true</Can-Redefine-Classes>
                <Can-Retransform-Classes>true</Can-Retransform-Classes>
            </manifestEntries>
        </archive>
    </configuration>
</plugin>

main program

Here, we directly start the main program and wait for the agent to be loaded. System.in is used in the main program to block to prevent the main process from ending ahead of time.

import java.io.IOException;

public class AgentmainTest {
    public static void main(String[] args) throws IOException {
        System.in.read();
    }
}

attach mechanism

Unlike premain mode, we can no longer connect the agent and the main program by adding startup parameters. Here, we need the VirtualMachine tool class in the com.sun.tools.attach package. Note that this class is not part of the Java SE standard API but a tool API shipped with Sun's JDK: up to JDK 8 it lives in tools.jar, and from JDK 9 on it is in the jdk.attach module, where no extra dependency is needed. For JDK 8, we need to add the dependency before use:

<dependency>
    <groupId>com.sun</groupId>
    <artifactId>tools</artifactId>
    <version>1.8</version>
    <scope>system</scope>
    <systemPath>${JAVA_HOME}\lib\tools.jar</systemPath>
</dependency>

VirtualMachine represents a java virtual machine to be attached, that is, the target virtual machine to be monitored in the program. External processes can use the instance of VirtualMachine to load the agent into the target virtual machine. Let's take a look at its static method attach:

public static VirtualMachine attach(String id) throws AttachNotSupportedException, IOException;

You can obtain an object representing a JVM through the attach method. The parameter passed in is the process id (pid) of the running target virtual machine. In other words, before using attach, we need to obtain the pid of the main program just started, which the jps command shows:

11140
16372 RemoteMavenServer36
16392 AgentmainTest
20204 Jps
2460 Launcher

The pid of the running main program AgentmainTest is 16392. We use it to connect to the virtual machine.

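import com.sun.tools.attach.VirtualMachine;

public class AttachTest {
    public static void main(String[] args) {
        try {
            // 16392 is the pid of AgentmainTest reported by jps
            VirtualMachine vm = VirtualMachine.attach("16392");
            vm.loadAgent("F:\\Workspace\\MyAgent\\target\\myAgent-1.0.jar", "param");
            // disconnect from the target JVM once the agent is loaded
            vm.detach();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}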

After obtaining the VirtualMachine instance, you can inject the agent through the loadAgent method. The first parameter of the method is the local path of the agent jar, and the second is the parameter passed to the agent. Execute AttachTest and return to the console of the main program AgentmainTest. You can see that the code in the agent has been executed:

In this way, a simple agentMain pattern agent is completed. You can sort out the relationship between the three modules through the following figure.
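Looking up the pid with jps and hard-coding it is fine for a demo, but the lookup can also be done in code. The sketch below uses VirtualMachine.list() from the same attach API to find a JVM by its display name. The class name AttachByName and the command-line arguments are assumptions for this example, and matching by contains is a simplification that may hit more than one process.

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.Optional;

public class AttachByName {
    /** Returns the pid of the first local JVM whose display name contains the given text. */
    static Optional<String> findPid(String mainClassName) {
        for (VirtualMachineDescriptor vmd : VirtualMachine.list()) {
            if (vmd.displayName().contains(mainClassName)) {
                return Optional.of(vmd.id());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: main class to look for, args[1]: path to the agent jar, args[2]: optional agent args
        if (args.length < 2) {
            System.err.println("usage: AttachByName <mainClass> <agentJar> [agentArgs]");
            return;
        }
        Optional<String> pid = findPid(args[0]);
        if (!pid.isPresent()) {
            System.err.println("no running JVM matches " + args[0]);
            return;
        }
        VirtualMachine vm = VirtualMachine.attach(pid.get());
        try {
            vm.loadAgent(args[1], args.length > 2 ? args[2] : null);
        } finally {
            vm.detach(); // release the connection to the target JVM
        }
    }
}
```

For the example above it would be started with AgentmainTest as the first argument and the path of the agent jar as the second.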

application

So far we have seen how the two modes are implemented, but we should not be satisfied with agents that only print statements. Let's see how Java agents can be used to do something practical.

In the two modes above, the logic of the agent is implemented in the premain method and the agentmain method respectively. Both methods have strict requirements on their signatures. The premain method may be declared with either of the following two signatures:

public static void premain(String agentArgs)
public static void premain(String agentArgs, Instrumentation inst)

The agentmain method may likewise be declared in two ways:

public static void agentmain(String agentArgs)
public static void agentmain(String agentArgs, Instrumentation inst)

If both signatures are present in the agent, the one with the Instrumentation parameter takes priority and is the one the JVM invokes; the JVM passes in the Instrumentation instance inst automatically. Let's see what can be done through Instrumentation.
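The priority rule can be imitated with plain reflection. The sketch below is not the JVM's actual lookup code, only an illustration of the same "two-argument form first, one-argument form as fallback" order; BothAgent and OneArgAgent are hypothetical agent classes.

```java
import java.lang.instrument.Instrumentation;
import java.lang.reflect.Method;

public class PremainLookupSketch {
    // Hypothetical agent declaring both signatures
    static class BothAgent {
        public static void premain(String agentArgs) { }
        public static void premain(String agentArgs, Instrumentation inst) { }
    }

    // Hypothetical agent declaring only the one-argument form
    static class OneArgAgent {
        public static void premain(String agentArgs) { }
    }

    /** Prefers (String, Instrumentation) and falls back to (String). */
    static Method resolve(Class<?> agentClass, String methodName) {
        try {
            return agentClass.getDeclaredMethod(methodName, String.class, Instrumentation.class);
        } catch (NoSuchMethodException preferredMissing) {
            try {
                return agentClass.getDeclaredMethod(methodName, String.class);
            } catch (NoSuchMethodException bothMissing) {
                throw new IllegalArgumentException(
                        agentClass + " has no usable " + methodName, bothMissing);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(BothAgent.class, "premain").getParameterCount());   // 2
        System.out.println(resolve(OneArgAgent.class, "premain").getParameterCount()); // 1
    }
}
```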

Instrumentation

First, a brief introduction to the Instrumentation interface. Its methods allow Java programs to be manipulated at run time, providing functions such as changing bytecode, adding jar packages and replacing classes. Through these functions, Java gains stronger dynamic control and introspection abilities. When writing an agent, the following three methods of Instrumentation are important and commonly used, so let's focus on them.

addTransformer

The addTransformer method lets us change the definition of a class as it is being loaded. Let's take a look at the definition of the method first:

void addTransformer(ClassFileTransformer transformer);

ClassFileTransformer is an interface built around the transform method (the only method in JDK 8). Once a transformer has been registered, transform is called for every class loaded afterwards, so with a premain agent it sees the classes of the main program before they are used. It can be thought of as a converter. We can implement this method to redefine a class. Let's see how to use it through an example.
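Before the full example, a transformer that changes nothing can make the callback contract concrete. The sketch below only records the names it is asked about and returns null, which tells the JVM that no transformation was performed. LoadLoggingTransformer and its seen list are hypothetical names for this illustration.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class LoadLoggingTransformer implements ClassFileTransformer {
    /** Internal names (with '/') of every class this transformer was called for. */
    final List<String> seen = new CopyOnWriteArrayList<>();

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        // className can be null for some classes, so convert defensively
        seen.add(String.valueOf(className));
        // null means "leave the class file bytes as they are"
        return null;
    }

    public static void main(String[] args) {
        // Simulates the call the JVM would make; no agent is needed for this
        LoadLoggingTransformer t = new LoadLoggingTransformer();
        t.transform(null, "com/cn/hydra/test/Fruit", null, null, new byte[0]);
        System.out.println(t.seen); // [com/cn/hydra/test/Fruit]
    }
}
```

In an agent it would be registered with inst.addTransformer(new LoadLoggingTransformer()) inside premain.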

First, create a Fruit class in the main program project:

public class Fruit {
    public void getFruit(){
        System.out.println("banana");
    }
}

After compilation, copy the class file, rename the copy to Fruit2.class, and then modify the method in Fruit as follows:

public void getFruit(){
    System.out.println("apple");
}

Create the main program, create a Fruit object in the main program and call its getFruit method:

public class TransformMain {
    public static void main(String[] args) {
        new Fruit().getFruit();
    }
}

At this point, running the program prints apple. Next, we implement the premain agent part.

In the agent's premain method, use the addTransformer method of Instrumentation to intercept class loading:

public class TransformAgent {
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new FruitTransformer());
    }
}

The FruitTransformer class implements the ClassFileTransformer interface, and the logic of the transformation class is in the transform method:

import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;

public class FruitTransformer implements ClassFileTransformer {
    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer){
        // className can be null for some classes, so compare with the constant first
        if (!"com/cn/hydra/test/Fruit".equals(className))
            return classfileBuffer;

        String fileName="F:\\Workspace\\agent-test\\target\\classes\\com\\cn\\hydra\\test\\Fruit2.class";
        return getClassBytes(fileName);
    }

    // Reads the whole class file into a byte array; returns null if it cannot be read
    public static byte[] getClassBytes(String fileName){
        try(InputStream is = new FileInputStream(fileName);
            ByteArrayOutputStream bs = new ByteArrayOutputStream()){
            byte[] buffer = new byte[4096];

            int n;
            while ((n = is.read(buffer)) != -1) {
                bs.write(buffer, 0, n);
            }
            // return what was accumulated, not the scratch buffer
            return bs.toByteArray();
        }catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }
}

In the transform method, two main things are done:

  • Because the addTransformer method cannot specify which class to transform, we need to check through className whether the class currently being loaded is the target class we want to intercept. For non-target classes, we directly return the original byte array. Note the format of className: the . in the fully qualified name of the class is replaced with /

  • Read the class file we copied earlier as a byte stream, and return it in place of the original classfileBuffer byte array to complete the replacement of the class definition
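The name format mentioned in the first point can be sketched as a small helper. It takes the class name as a string on purpose: referring to Fruit.class inside the transformer would load the class before the transformer gets a chance to see it. The helper class InternalNames is a hypothetical name for this example.

```java
public class InternalNames {
    /** Converts a binary name such as "java.util.Map$Entry" to the form passed to transform. */
    static String toInternalName(String binaryName) {
        return binaryName.replace('.', '/');
    }

    /** Null-safe check of the className argument against the wanted class. */
    static boolean isTarget(String className, String wantedBinaryName) {
        return toInternalName(wantedBinaryName).equals(className);
    }

    public static void main(String[] args) {
        System.out.println(toInternalName("com.cn.hydra.test.Fruit")); // com/cn/hydra/test/Fruit
        // Nested classes keep their '$' separator
        System.out.println(toInternalName("java.util.Map$Entry"));     // java/util/Map$Entry
    }
}
```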

After the agent part is packaged, add startup parameters in the main program:

-javaagent:F:\Workspace\MyAgent\target\transformAgent-1.0.jar

Execute the main program again and print the results:

banana

In this way, the replacement of class before the execution of the main method is realized.

redefineClasses

Its function is clear from the name of the method: it redefines classes, which in plain terms means replacing the definition of the specified classes. The method is defined as follows:

void redefineClasses(ClassDefinition... definitions) throws  ClassNotFoundException, UnmodifiableClassException;

Its parameter is a variable length ClassDefinition array. Let's look at the construction method of ClassDefinition:

public ClassDefinition(Class<?> theClass,byte[] theClassFile) {...}

A ClassDefinition pairs a Class object with a modified bytecode array; put simply, the original class definition is replaced with the provided class file bytes. In addition, redefineClasses accepts an array of ClassDefinition and applies the definitions as a set, so that interdependent classes can be changed together.

Let's see it in action through an example. The premain agent part:

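import java.lang.instrument.ClassDefinition;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;

public class RedefineAgent {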
    public static void premain(String agentArgs, Instrumentation inst) 
            throws UnmodifiableClassException, ClassNotFoundException {
        String fileName="F:\\Workspace\\agent-test\\target\\classes\\com\\cn\\hydra\\test\\Fruit2.class";
        ClassDefinition def=new ClassDefinition(Fruit.class,
                FruitTransformer.getClassBytes(fileName));
        inst.redefineClasses(new ClassDefinition[]{def});
    }
}

The main program can directly reuse the above and print after execution:

banana

You can see that the original class is replaced with the bytes of the class file specified by us, that is, the replacement of the specified class is realized.
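The Can-Redefine-Classes manifest entry surfaces at run time as Instrumentation.isRedefineClassesSupported(), and individual classes can be checked with isModifiableClass. A minimal sketch of a guard around redefineClasses follows, assuming we prefer to skip the redefinition rather than fail; the class name RedefineGuard and its methods are hypothetical.

```java
import java.lang.instrument.ClassDefinition;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;

public class RedefineGuard {
    /** True if the agent was granted redefinition and the JVM allows changing this class. */
    static boolean canRedefine(Instrumentation inst, Class<?> target) {
        return inst.isRedefineClassesSupported() && inst.isModifiableClass(target);
    }

    /** Redefines the class only when it is allowed and the new bytes were actually read. */
    static boolean redefineIfPossible(Instrumentation inst, Class<?> target, byte[] newBytes)
            throws ClassNotFoundException, UnmodifiableClassException {
        // ClassDefinition rejects null arguments, so check before constructing it
        if (newBytes == null || !canRedefine(inst, target)) {
            return false;
        }
        inst.redefineClasses(new ClassDefinition(target, newBytes));
        return true;
    }
}
```

In the RedefineAgent above, the call to inst.redefineClasses could be replaced by RedefineGuard.redefineIfPossible(inst, Fruit.class, FruitTransformer.getClassBytes(fileName)).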

retransformClasses

retransformClasses is typically used in agentmain mode. It lets you transform a class again after it has been loaded, that is, it re-runs the registered transformers on an already loaded class. Let's first look at the definition of this method:

void retransformClasses(Class<?>... classes) throws UnmodifiableClassException;

Its parameter classes is an array of classes that need to be converted. The variable length parameter also shows that it can also convert class definitions in batch, just like the redefineClasses method.

Next, let's take an example to see how to use the retransformClasses method. The code of the agent part is as follows:

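import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;

public class RetransformAgent {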
    public static void agentmain(String agentArgs, Instrumentation inst)
            throws UnmodifiableClassException {
        inst.addTransformer(new FruitTransformer(),true);
        inst.retransformClasses(Fruit.class);
        System.out.println("retransform success");
    }
}

Take a look at the definition of the addTransformer method called here, which is slightly different from the above:

void addTransformer(ClassFileTransformer transformer, boolean canRetransform);

The transformer is still the FruitTransformer from above. Focus on the newly added second parameter: when canRetransform is true, the transformer is allowed to take part in retransformation. Calling retransformClasses then invokes the transform method of each such transformer again, and the bytes it returns are installed as the new class definition.

For the main program, we keep executing the print statement in an infinite loop to observe whether the class has changed:

import java.util.concurrent.TimeUnit;

public class RetransformMain {
    public static void main(String[] args) throws InterruptedException {
        while(true){
            new Fruit().getFruit();
            TimeUnit.SECONDS.sleep(5);
        }
    }
}

Finally, use the attach API to inject the agent into the main program:

import com.sun.tools.attach.VirtualMachine;

public class AttachRetransform {
    public static void main(String[] args) throws Exception {
        VirtualMachine vm = VirtualMachine.attach("6380");
        vm.loadAgent("F:\\Workspace\\MyAgent\\target\\retransformAgent-1.0.jar");
    }
}

Return to the main program console and view the running results:

You can see that the print statement changes after the agent is injected, indicating that the class definition has been changed and reloaded.
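The agent above leaves FruitTransformer registered for the rest of the JVM's life. For a one-off change, a possible variation is to register the transformer, retransform, and remove it again. This is a sketch under the assumption that a single retransformation is all that is needed; RetransformOnce and apply are hypothetical names.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;

public class RetransformOnce {
    /** Applies the transformer to the given loaded classes once, then unregisters it. */
    static boolean apply(Instrumentation inst, ClassFileTransformer transformer, Class<?>... targets) {
        if (!inst.isRetransformClassesSupported()) {
            return false; // Can-Retransform-Classes was not granted in the manifest
        }
        // true marks the transformer as retransformation-capable
        inst.addTransformer(transformer, true);
        try {
            inst.retransformClasses(targets);
            return true;
        } catch (UnmodifiableClassException e) {
            return false;
        } finally {
            // the transformer is no longer consulted for classes loaded later
            inst.removeTransformer(transformer);
        }
    }
}
```

One trade-off to keep in mind: a later retransformation of the same class by any agent starts again from the original class file bytes, so once the transformer has been removed the change would not be reapplied.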

other

In addition to these main methods, there are other methods in Instrumentation. Here is a brief list of the functions of common methods:

  • removeTransformer: removes a registered ClassFileTransformer

  • getAllLoadedClasses: gets all classes currently loaded by the JVM

  • getInitiatedClasses: gets the classes for which the specified ClassLoader is an initiating loader

  • getObjectSize: gets an estimate of the space occupied by an object

  • appendToBootstrapClassLoaderSearch: adds a jar package to the search path of the bootstrap class loader

  • appendToSystemClassLoaderSearch: adds a jar package to the search path of the system class loader

  • isNativeMethodPrefixSupported: checks whether a native method prefix can be set, that is, whether native methods can be intercepted

  • setNativeMethodPrefix: sets the prefix of native methods

Javassist

In the examples above, we read the bytes of a class file directly to redefine or transform a class. In a real working environment, however, we are more likely to modify the bytecode dynamically. For this, javassist makes modifying bytecode much simpler.

In short, javassist is a class library for analyzing, editing and creating java bytecode. When in use, we can directly call its api to dynamically change or generate the structure of class in the form of coding. Compared with ASM and other bytecode frameworks that require understanding the underlying virtual machine instructions, javassist is really very simple and fast.

Next, let's take a simple example to see how to use Java agent and javassist together. First, introduce the dependency of javassist:

<dependency>
    <groupId>org.javassist</groupId>
    <artifactId>javassist</artifactId>
    <version>3.20.0-GA</version>
</dependency>

The function we want to achieve is to measure the execution time of a method through the agent. The premain agent part is basically the same as before. First, add a transformer:

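import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class Agent {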
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LogTransformer());
    }

    static class LogTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined, 
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) 
            throws IllegalClassFormatException {
            if (!"com/cn/hydra/test/Fruit".equals(className))
                return null;

            try {
                return calculate();
            } catch (Exception e) {
                e.printStackTrace();
                return null;
            }
        }
    }
}

In the calculate method, javassist is used to dynamically change the definition of the method:

static byte[] calculate() throws Exception {
    ClassPool pool = ClassPool.getDefault();
    CtClass ctClass = pool.get("com.cn.hydra.test.Fruit");
    CtMethod ctMethod = ctClass.getDeclaredMethod("getFruit");
    CtMethod copyMethod = CtNewMethod.copy(ctMethod, ctClass, new ClassMap());
    ctMethod.setName("getFruit$agent");

    StringBuffer body = new StringBuffer("{\n")
            .append("long begin = System.nanoTime();\n")
            .append("getFruit$agent($$);\n")
            .append("System.out.println(\"use \"+(System.nanoTime() - begin) +\" ns\");\n")
            .append("}");
    copyMethod.setBody(body.toString());
    ctClass.addMethod(copyMethod);
    return ctClass.toBytecode();
}

In the above code, these functions are mainly realized:

  • Get the CtClass of the class by its fully qualified name

  • Get the CtMethod by method name, and copy it into a new method through CtNewMethod.copy

  • Rename the old method to getFruit$agent

  • Replace the body of the copied method through setBody, so that the new method adds the timing logic and calls the old method, and finally add the new method to the class

The main program still reuses the previous code. Run it and view the results; the agent now reports the execution time of the method:

At this time, we can look at it through reflection:

for (Method method : Fruit.class.getDeclaredMethods()) {
    System.out.println(method.getName());
    method.invoke(new Fruit());
    System.out.println("-------");
}

Viewing the results, you can see that a new method has been added to the class:

In addition, javassist has many other functions, such as creating a new class, setting a parent class, and reading and writing bytecode. You can learn their usage as specific scenarios require.

summary

Although we may not often use Java agents directly in everyday work, they may be hidden in a corner of the business system, quietly playing a major role in hot deployment, monitoring, performance analysis and other tools.

This article started from the two modes of Java Agent, implemented them by hand and briefly analyzed their workflow. Although we only used them for some simple functions here, it is fair to say that Java Agent lets a running program be changed in ways its source code never anticipated, which opens up many possibilities for our code.


Posted on Sun, 21 Nov 2021 20:08:06 -0500 by blackandwhite