Lambda expression serialization and clever uses of SerializedLambda in the JDK

Background

In my spare time, I want to write a lightweight JDBC-based ORM framework with Javassist at its core, in order to get rid of reflection calls. Along the way I read the source code of mybatis, tk-mapper, mybatis-plus and spring-boot-starter-jdbc, and found that LambdaQueryWrapper in mybatis-plus can obtain information about the method referenced by the Lambda expression passed to it (strictly speaking, CallSite information). This article is a complete record of how that works. It is based on JDK 11 and does not necessarily apply to other JDK versions.

The magic of Lambda expression serialization

When I previously read the source code that implements Lambda expressions, I did not read the comments of LambdaMetafactory carefully. One of the comments at the top of this class reads as follows:

Serializability. Generally, the generated function objects (that is, the function objects implemented from Lambda expressions) do not need to support serialization. If support is needed, FLAG_SERIALIZABLE (a static int constant of LambdaMetafactory, with the value 1 << 0) can be used to indicate that the function objects should be serializable. Serializable function objects use instances of the SerializedLambda class as their serialized form, which requires additional assistance from the "capturing class" (the class described by the caller parameter of type MethodHandles.Lookup). For details, see SerializedLambda.

Search for FLAG_SERIALIZABLE in the comments of LambdaMetafactory and you will find this comment:

The main idea is: once the FLAG_SERIALIZABLE flag is set, the generated function object will implement the Serializable interface and will have a method named writeReplace whose return type is SerializedLambda. The class that creates these function objects (the "capturing class" mentioned earlier) must have a method named $deserializeLambda$, as described in the SerializedLambda class.
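The effect of the flag can be observed directly. The following is a minimal sketch (the class name WriteReplacePresence and its helper methods are made up for illustration) that compares the runtime class of an ordinary lambda with that of a lambda whose target type includes Serializable. It relies on javac requesting FLAG_SERIALIZABLE for the second lambda, which is the behavior described in the comment above.

```java
import java.io.Serializable;
import java.lang.reflect.Method;

public class WriteReplacePresence {

    /** Returns true if the runtime class of the given object declares a no-arg writeReplace(). */
    public static boolean hasWriteReplace(Object lambda) {
        for (Method m : lambda.getClass().getDeclaredMethods()) {
            if (m.getName().equals("writeReplace") && m.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }

    public static boolean plainLambdaHasWriteReplace() {
        // Runnable is not Serializable, so no serialization support is requested
        Runnable plain = () -> { };
        return hasWriteReplace(plain);
    }

    public static boolean serializableLambdaHasWriteReplace() {
        // The intersection cast makes the target type Serializable
        Runnable serializable = (Runnable & Serializable) () -> { };
        return hasWriteReplace(serializable);
    }

    public static void main(String[] args) {
        System.out.println("plain lambda:        " + plainLambdaHasWriteReplace());
        System.out.println("serializable lambda: " + serializableLambdaHasWriteReplace());
    }
}
```

On a HotSpot JDK 11 the first check is expected to report false and the second true, since only the serializable lambda class receives a writeReplace method.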

Finally, look at the description of SerializedLambda. Its class comment has four paragraphs, and the core information of each one is extracted here.

The main idea of each paragraph is as follows:

  • Paragraph 1: SerializedLambda is the serialized form of a Lambda expression; it stores the runtime information of the Lambda expression
  • Paragraph 2: to ensure that serializable Lambda expressions deserialize correctly, the compiler or language runtime library can make sure that the writeReplace method returns a SerializedLambda instance
  • Paragraph 3: SerializedLambda provides a readResolve method that looks up the static method $deserializeLambda$(SerializedLambda) in the "capturing class", calls it with the SerializedLambda instance itself as the argument, and returns the result. This is the deserialization step
  • Paragraph 4: the identity of a function object produced by serialization and deserialization is unpredictable, so the results of identity-sensitive operations (such as System.identityHashCode(), object locking, etc.) may vary

The final conclusion is: if a functional interface extends the Serializable interface, the class generated for its Lambda instances automatically gets a writeReplace method that returns a SerializedLambda instance. You can obtain the runtime information of the Lambda from that SerializedLambda instance. This runtime information consists of the attributes of SerializedLambda:

Attribute | Meaning
capturingClass | The capturing class: the class in which the Lambda expression appears
functionalInterfaceClass | Name, separated by "/", of the functional interface, i.e. the static type of the returned Lambda object
functionalInterfaceMethodName | Name of the functional interface method
functionalInterfaceMethodSignature | Signature of the functional interface method (parameter types and return type; for generic types, the erased types)
implClass | Name, separated by "/", of the class that holds the implementation method of the functional interface method
implMethodName | Name of the implementation method
implMethodSignature | Signature of the implementation method (parameter types and return type)
instantiatedMethodType | Signature of the functional interface method after its type variables have been replaced with the concrete types used at the capture site
capturedArgs | Dynamic arguments captured by the Lambda
implMethodKind | The MethodHandle kind of the implementation method

For a practical example, define a functional interface that extends Serializable and call it:

import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

public class App {

    @FunctionalInterface
    public interface CustomerFunction<S, T> extends Serializable {

        T convert(S source);
    }

    public static void main(String[] args) throws Exception {
        CustomerFunction<String, Long> function = Long::parseLong;
        Long result = function.convert("123");
        System.out.println(result);
        // writeReplace is declared on the runtime-generated Lambda class and is private
        Method method = function.getClass().getDeclaredMethod("writeReplace");
        method.setAccessible(true);
        SerializedLambda serializedLambda = (SerializedLambda) method.invoke(function);
        System.out.println(serializedLambda.getCapturingClass());
    }
}

Stepping through this program in a debugger shows the returned SerializedLambda instance with the attributes listed above filled in.
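The same information can be printed without a debugger. The sketch below (the class name LambdaFieldDump and the describe helper are hypothetical) reads every attribute through the public getters of java.lang.invoke.SerializedLambda for the same Long::parseLong method reference.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

public class LambdaFieldDump {

    @FunctionalInterface
    public interface CustomerFunction<S, T> extends Serializable {
        T convert(S source);
    }

    /** Collects the SerializedLambda attributes of the given lambda into an ordered map. */
    public static Map<String, Object> describe(Serializable lambda) throws Exception {
        Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
        writeReplace.setAccessible(true);
        SerializedLambda sl = (SerializedLambda) writeReplace.invoke(lambda);

        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("capturingClass", sl.getCapturingClass());
        fields.put("functionalInterfaceClass", sl.getFunctionalInterfaceClass());
        fields.put("functionalInterfaceMethodName", sl.getFunctionalInterfaceMethodName());
        fields.put("functionalInterfaceMethodSignature", sl.getFunctionalInterfaceMethodSignature());
        fields.put("implClass", sl.getImplClass());
        fields.put("implMethodName", sl.getImplMethodName());
        fields.put("implMethodSignature", sl.getImplMethodSignature());
        fields.put("implMethodKind", sl.getImplMethodKind());
        fields.put("instantiatedMethodType", sl.getInstantiatedMethodType());
        fields.put("capturedArgCount", sl.getCapturedArgCount());
        return fields;
    }

    /** Describes the method reference used in the example above. */
    public static Map<String, Object> describeParseLong() throws Exception {
        CustomerFunction<String, Long> function = Long::parseLong;
        return describe(function);
    }

    public static void main(String[] args) throws Exception {
        describeParseLong().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Note the difference between functionalInterfaceMethodSignature, which holds the erased signature (Object to Object), and instantiatedMethodType, which holds the concrete types at the capture site (String to Long).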

In this way, you can obtain runtime information about the call site of a functional interface instance, including (through instantiatedMethodType) the types before generic erasure, and many tricks can be built on top of this. For example:

import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

import lombok.Data; // Lombok generates the getters, setters and toString used below

public class ConditionApp {

    @FunctionalInterface
    public interface CustomerFunction<S, T> extends Serializable {

        T convert(S source);
    }

    @Data
    public static class User {

        private String name;
        private String site;
    }

    public static void main(String[] args) throws Exception {
        Condition c1 = addCondition(User::getName, "=", "throwable");
        System.out.println("c1 = " + c1);
        Condition c2 = addCondition(User::getSite, "IN", "('throwx.cn','vlts.cn')");
        System.out.println("c2 = " + c2);
    }

    private static <S> Condition addCondition(CustomerFunction<S, String> function,
                                              String operation,
                                              Object value) throws Exception {
        Condition condition = new Condition();
        Method method = function.getClass().getDeclaredMethod("writeReplace");
        method.setAccessible(true);
        SerializedLambda serializedLambda = (SerializedLambda) method.invoke(function);
        String implMethodName = serializedLambda.getImplMethodName();
        // Derive the property name from the getter name, e.g. getName -> name.
        // startsWith avoids mis-parsing names that contain "get" again later (e.g. getTarget).
        if (implMethodName.startsWith("get") && implMethodName.length() > 3) {
            condition.setField(Character.toLowerCase(implMethodName.charAt(3)) + implMethodName.substring(4));
        }
        // implClass uses "/" as the separator, so convert it to a binary class name first
        condition.setEntityKlass(Class.forName(serializedLambda.getImplClass().replace("/", ".")));
        condition.setOperation(operation);
        condition.setValue(value);
        return condition;
    }

    @Data
    private static class Condition {

        private Class<?> entityKlass;
        private String field;
        private String operation;
        private Object value;
    }
}

// Output
c1 = ConditionApp.Condition(entityKlass=class club.throwable.lambda.ConditionApp$User, field=name, operation==, value=throwable)
c2 = ConditionApp.Condition(entityKlass=class club.throwable.lambda.ConditionApp$User, field=site, operation=IN, value=('throwx.cn','vlts.cn'))

Many people worry about the performance of reflection calls. In recent JDK versions reflection has been heavily optimized, and once warmed up its cost is often close to that of a direct call. Moreover, scenarios like this one only make a small number of reflection calls, so it can be used safely.

Having spent a lot of space on the function and use of SerializedLambda, let's now look at the serialization and deserialization of Lambda expressions themselves:
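If the reflective call is still a concern, one option is to do it only once per Lambda class and remember the result. The following is a minimal sketch of that idea, not something taken from any of the frameworks mentioned above. The class name CachedLambdaNames, the Getter interface and the cache layout are all assumptions made for illustration. Only the implementation method name is cached, because it is fixed per Lambda class, whereas captured arguments can differ between instances.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CachedLambdaNames {

    @FunctionalInterface
    public interface Getter<S, T> extends Serializable {
        T get(S source);
    }

    // Hypothetical cache: runtime Lambda class -> implementation method name.
    // The keys are strong references, so cached Lambda classes cannot be unloaded.
    private static final Map<Class<?>, String> CACHE = new ConcurrentHashMap<>();

    /** Resolves the implementation method name, calling writeReplace at most once per class. */
    public static String implMethodName(Serializable lambda) {
        return CACHE.computeIfAbsent(lambda.getClass(), klass -> {
            try {
                Method writeReplace = klass.getDeclaredMethod("writeReplace");
                writeReplace.setAccessible(true);
                return ((SerializedLambda) writeReplace.invoke(lambda)).getImplMethodName();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Not a serializable lambda: " + klass, e);
            }
        });
    }

    public static int cacheSize() {
        return CACHE.size();
    }

    /** Resolves the same method reference several times; only the first call uses reflection. */
    public static String demo() {
        Getter<String, Integer> getter = String::length;
        String name = null;
        for (int i = 0; i < 3; i++) {
            name = implMethodName(getter);
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(demo() + ", cached classes: " + cacheSize());
    }
}
```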

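import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializedLambdaApp {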

    @FunctionalInterface
    public interface CustomRunnable extends Serializable {

        void run();
    }

    public static void main(String[] args) throws Exception {
        invoke(() -> {
        });
    }

    private static void invoke(CustomRunnable customRunnable) throws Exception {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(baos);
        oos.writeObject(customRunnable);
        oos.close();
        ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()));
        Object target = ois.readObject();
        System.out.println(target);
    }
}

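Running it prints the deserialized Lambda instance, whose class is a runtime-generated one with a name of the form club.throwable.lambda.SerializedLambdaApp$$Lambda$26/0x00000008000a4040.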

How Lambda expression serialization works

To understand how Lambda expression serialization works, you can read the source code of ObjectStreamClass, ObjectOutputStream and ObjectInputStream directly. Here are the conclusions:

  • Prerequisite: the object to be serialized must implement the Serializable interface
  • If the class of the object to be serialized declares a writeReplace method, that method is invoked reflectively on the instance, and the object it returns is serialized in place of the original. For Lambda expressions, the returned object is of type SerializedLambda
  • Deserialization is the reverse process, and the method called is readResolve. As mentioned earlier, SerializedLambda has a private method with exactly this name
  • The implementation type of a Lambda expression is a class generated at runtime. Judging from the results, the instance before serialization and the instance after deserialization belong to different generated classes. For the example in the previous section, the class before serialization is club.throwable.lambda.SerializedLambdaApp$$Lambda$14/0x000000080065840, and the class after deserialization is club.throwable.lambda.SerializedLambdaApp$$Lambda$26/0x00000008000a4040

ObjectStreamClass is the class descriptor used by the serialization and deserialization implementation. The class-level information needed for serializing and deserializing an object can be found in its member fields, including the writeReplace and readResolve methods mentioned here.
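The writeReplace and readResolve hooks are not specific to Lambda expressions, so the mechanism can be illustrated with two ordinary classes. In the sketch below all names are invented: Temperature plays the role of the Lambda instance and Form plays the role of SerializedLambda. The overridden resolveClass records which classes actually appear in the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class ReplaceResolveDemo {

    /** Plays the role of the Lambda instance: it is never written to the stream itself. */
    public static final class Temperature implements Serializable {
        private static final long serialVersionUID = 1L;
        public final double celsius;

        public Temperature(double celsius) {
            this.celsius = celsius;
        }

        // Called reflectively by ObjectOutputStream; the returned object is written instead
        private Object writeReplace() {
            return new Form(celsius);
        }
    }

    /** Plays the role of SerializedLambda: the serialized form that rebuilds the original. */
    public static final class Form implements Serializable {
        private static final long serialVersionUID = 1L;
        private final double celsius;

        Form(double celsius) {
            this.celsius = celsius;
        }

        // Called reflectively by ObjectInputStream; the returned object replaces the Form
        private Object readResolve() {
            return new Temperature(celsius);
        }
    }

    /** Names of the classes found in the stream during the most recent roundTrip call. */
    public static final List<String> streamClasses = new ArrayList<>();

    public static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(obj);
        }
        streamClasses.clear();
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray())) {
            @Override
            protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
                streamClasses.add(desc.getName());
                return super.resolveClass(desc);
            }
        }) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object back = roundTrip(new Temperature(21.5));
        System.out.println("classes in stream: " + streamClasses);
        System.out.println("deserialized type: " + back.getClass().getSimpleName());
    }
}
```

The stream contains only the Form class, yet the caller gets a Temperature back. Lambda serialization follows the same pattern, with the difference that SerializedLambda.readResolve delegates the reconstruction to the capturing class.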

The graphical process is as follows:

How to get SerializedLambda

From the previous analysis, we know that there are two ways to obtain the SerializedLambda instance of a Lambda expression:

  • Method 1: reflectively call the writeReplace method declared on the runtime-generated class of the Lambda expression instance; the return value is the SerializedLambda instance
  • Method 2: obtain the SerializedLambda instance through serialization and deserialization

An example can be written for each of the two methods. The reflection approach looks like this:

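// Reflection approach
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

public class ReflectionSolution {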

    @FunctionalInterface
    public interface CustomerFunction<S, T> extends Serializable {

        T convert(S source);
    }

    public static void main(String[] args) throws Exception {
        CustomerFunction<String, Long> function = Long::parseLong;
        SerializedLambda serializedLambda = getSerializedLambda(function);
        System.out.println(serializedLambda.getCapturingClass());
    }

    public static SerializedLambda getSerializedLambda(Serializable serializable) throws Exception {
        Method writeReplaceMethod = serializable.getClass().getDeclaredMethod("writeReplace");
        writeReplaceMethod.setAccessible(true);
        return (SerializedLambda) writeReplaceMethod.invoke(serializable);
    }
}

The serialization and deserialization approach is slightly more complicated, because ObjectInputStream.readObject() eventually calls back SerializedLambda.readResolve(), so the returned result is a new Lambda expression instance backed by a newly generated class rather than the SerializedLambda itself. We therefore need a way to interrupt this call and return the intermediate result early. The approach is to construct a shadow type that mirrors SerializedLambda but has no readResolve() method:

package cn.vlts;
import java.io.Serializable;

/**
 * Note: this class must have the same simple name as java.lang.invoke.SerializedLambda (the package may differ)
 * and the same serialVersionUID. This "tricks" the class name check in ObjectStreamClass.classNamesEqual(),
 * which only compares the simple class names.
 */
@SuppressWarnings("ALL")
public class SerializedLambda implements Serializable {
    private static final long serialVersionUID = 8025925345765570181L;
    private Class<?> capturingClass;
    private String functionalInterfaceClass;
    private String functionalInterfaceMethodName;
    private String functionalInterfaceMethodSignature;
    private String implClass;
    private String implMethodName;
    private String implMethodSignature;
    private int implMethodKind;
    private String instantiatedMethodType;
    private Object[] capturedArgs;

    public String getCapturingClass() {
        return capturingClass.getName().replace('.', '/');
    }
    public String getFunctionalInterfaceClass() {
        return functionalInterfaceClass;
    }
    public String getFunctionalInterfaceMethodName() {
        return functionalInterfaceMethodName;
    }
    public String getFunctionalInterfaceMethodSignature() {
        return functionalInterfaceMethodSignature;
    }
    public String getImplClass() {
        return implClass;
    }
    public String getImplMethodName() {
        return implMethodName;
    }
    public String getImplMethodSignature() {
        return implMethodSignature;
    }
    public int getImplMethodKind() {
        return implMethodKind;
    }
    public final String getInstantiatedMethodType() {
        return instantiatedMethodType;
    }
    public int getCapturedArgCount() {
        return capturedArgs.length;
    }
    public Object getCapturedArg(int i) {
        return capturedArgs[i];
    }
}


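// Serialization approach
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerializationSolution {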

    @FunctionalInterface
    public interface CustomerFunction<S, T> extends Serializable {

        T convert(S source);
    }

    public static void main(String[] args) throws Exception {
        CustomerFunction<String, Long> function = Long::parseLong;
        cn.vlts.SerializedLambda serializedLambda = getSerializedLambda(function);
        System.out.println(serializedLambda.getCapturingClass());
    }

    private static cn.vlts.SerializedLambda getSerializedLambda(Serializable serializable) throws Exception {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(serializable);
            oos.flush();
            try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray())) {
                @Override
                protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
                    Class<?> klass = super.resolveClass(desc);
                    return klass == java.lang.invoke.SerializedLambda.class ? cn.vlts.SerializedLambda.class : klass;
                }
            }) {
                return (cn.vlts.SerializedLambda) ois.readObject();
            }
        }
    }
}
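As a possible alternative that needs neither reflection on writeReplace nor a shadow class, the write side can be intercepted instead of the read side. This is a sketch and not part of the original approach. It relies on ObjectOutputStream.replaceObject, which a subclass can enable through enableReplaceObject(true) and which is called with the object returned by writeReplace. The names ReplaceObjectSolution and CapturingStream are invented for this example.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;

public class ReplaceObjectSolution {

    @FunctionalInterface
    public interface CustomerFunction<S, T> extends Serializable {
        T convert(S source);
    }

    /** Output stream that remembers the first SerializedLambda passed to replaceObject. */
    private static final class CapturingStream extends ObjectOutputStream {

        private SerializedLambda captured;

        CapturingStream() throws IOException {
            // The serialized bytes are not needed, so they are discarded (Java 11+)
            super(OutputStream.nullOutputStream());
            enableReplaceObject(true);
        }

        @Override
        protected Object replaceObject(Object obj) {
            // At this point writeReplace has already been applied to the lambda
            if (captured == null && obj instanceof SerializedLambda) {
                captured = (SerializedLambda) obj;
            }
            return obj;
        }
    }

    public static SerializedLambda getSerializedLambda(Serializable lambda) throws IOException {
        try (CapturingStream out = new CapturingStream()) {
            out.writeObject(lambda);
            if (out.captured == null) {
                throw new IllegalArgumentException("Not a serializable lambda: " + lambda.getClass());
            }
            return out.captured;
        }
    }

    /** Resolves the method reference used in the examples above. */
    public static SerializedLambda demo() throws IOException {
        CustomerFunction<String, Long> function = Long::parseLong;
        return getSerializedLambda(function);
    }

    public static void main(String[] args) throws IOException {
        SerializedLambda serializedLambda = demo();
        System.out.println(serializedLambda.getImplClass() + "." + serializedLambda.getImplMethodName());
    }
}
```

The trade-off is that the whole SerializedLambda, including any captured arguments, is still written to the discarded stream, so captured arguments must themselves be serializable.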

The forgotten $deserializeLambda$ method

As mentioned earlier, the java.lang.invoke.SerializedLambda.readResolve() method is called when a Lambda expression instance is deserialized. Interestingly, the source code of this method is as follows:

private Object readResolve() throws ReflectiveOperationException {
    try {
        Method deserialize = AccessController.doPrivileged(new PrivilegedExceptionAction<>() {
            @Override
            public Method run() throws Exception {
                Method m = capturingClass.getDeclaredMethod("$deserializeLambda$", SerializedLambda.class);
                m.setAccessible(true);
                return m;
            }
        });

        return deserialize.invoke(null, this);
    }
    catch (PrivilegedActionException e) {
        Exception cause = e.getException();
        if (cause instanceof ReflectiveOperationException)
            throw (ReflectiveOperationException) cause;
        else if (cause instanceof RuntimeException)
            throw (RuntimeException) cause;
        else
            throw new RuntimeException("Exception in SerializedLambda.readResolve", e);
    }
}

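This implies that the capturing class contains a static method like the following (a conceptual sketch, not real source code):

class CapturingClass {

    private static Object $deserializeLambda$(SerializedLambda serializedLambda) {
        // Conceptually: match serializedLambda against the Lambda expressions declared
        // in this class and return the corresponding Lambda expression instance
        return /* [serializedLambda] => Lambda expression instance */ null;
    }
}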

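You can try to look the method up in the capturing class (ReflectionUtils here is assumed to be Spring's org.springframework.util.ReflectionUtils):

import java.io.Serializable;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Objects;

import org.springframework.util.ReflectionUtils; // assumption: the Spring utility class

public class CapturingClassApp {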

    @FunctionalInterface
    public interface CustomRunnable extends Serializable {

        void run();
    }

    public static void main(String[] args) throws Exception {
        invoke(() -> {
        });
    }

    private static void invoke(CustomRunnable customRunnable) throws Exception {
        Method writeReplaceMethod = customRunnable.getClass().getDeclaredMethod("writeReplace");
        writeReplaceMethod.setAccessible(true);
        java.lang.invoke.SerializedLambda serializedLambda = (java.lang.invoke.SerializedLambda)
                writeReplaceMethod.invoke(customRunnable);
        Class<?> capturingClass = Class.forName(serializedLambda.getCapturingClass().replace("/", "."));
        ReflectionUtils.doWithMethods(capturingClass, method -> {
                    System.out.printf("Method name:%s,Modifier :%s,Method parameter list:%s,Method return value type:%s\n", method.getName(),
                            Modifier.toString(method.getModifiers()),
                            Arrays.toString(method.getParameterTypes()),
                            method.getReturnType().getName());
                },
                method -> Objects.equals(method.getName(), "$deserializeLambda$"));
    }
}

// Output
Method name:$deserializeLambda$,Modifier :private static,Method parameter list:[class java.lang.invoke.SerializedLambda],Method return value type:java.lang.Object

So the capturing class really does contain a method that converts a SerializedLambda instance into a Lambda expression instance, which is consistent with the class comment of java.lang.invoke.SerializedLambda mentioned earlier. No trace of this method can be found in the source code because it is a synthetic method: javac generates $deserializeLambda$ for any class that contains serializable Lambda expressions. As a private synthetic method it cannot be called from ordinary source code and is only reachable through reflection, which makes it something of a hidden trick.
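The lookup can also be done with plain JDK reflection, and the method can be invoked directly to rebuild a Lambda instance. The sketch below assumes the class is compiled with javac. The names DeserializeLambdaDemo and CustomSupplier are invented for illustration.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class DeserializeLambdaDemo {

    @FunctionalInterface
    public interface CustomSupplier extends Serializable {
        String get();
    }

    /** Looks up the compiler-generated method on this (capturing) class. */
    public static Method findDeserializeMethod() throws Exception {
        return DeserializeLambdaDemo.class.getDeclaredMethod("$deserializeLambda$", SerializedLambda.class);
    }

    /** Turns a lambda into its SerializedLambda and back by calling $deserializeLambda$ directly. */
    public static String rebuildAndCall() throws Exception {
        CustomSupplier original = () -> "hello";

        Method writeReplace = original.getClass().getDeclaredMethod("writeReplace");
        writeReplace.setAccessible(true);
        SerializedLambda serializedLambda = (SerializedLambda) writeReplace.invoke(original);

        // Same call that SerializedLambda.readResolve() performs internally
        Method deserialize = findDeserializeMethod();
        deserialize.setAccessible(true);
        CustomSupplier rebuilt = (CustomSupplier) deserialize.invoke(null, serializedLambda);
        return rebuilt.get();
    }

    public static void main(String[] args) throws Exception {
        Method method = findDeserializeMethod();
        System.out.println("synthetic: " + method.isSynthetic());
        System.out.println("modifiers: " + Modifier.toString(method.getModifiers()));
        System.out.println("rebuilt lambda returns: " + rebuildAndCall());
    }
}
```

Method.isSynthetic() is what distinguishes this method from a hand-written one; Modifier.toString only reports private static, matching the output shown above.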

Summary

Lambda expressions have been part of the JDK for many years, and it was unexpected to only now work out how they are serialized and deserialized. It is not a complex problem, but it is one of the more interesting things I have come across recently.

References:

  • JDK 11 source code
  • mybatis-plus source code



Posted on Tue, 30 Nov 2021 21:00:47 -0500 by echoofavalon