How to convert a PyTorch model to TensorRT
I have previously written a detailed article on how to convert from PyTorch -> ONNX -> TensorRT, so the process will not be repeated here. Interested readers can find it at: https://editor.csdn.net/md/?articleId=117778308
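For completeness, here is a minimal sketch of the PyTorch-to-ONNX export step (my own illustration, not code from that article; the model, input size, and file name are placeholders):

import torch
import torch.nn as nn

# Placeholder network standing in for the real model.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(1),
    nn.Linear(16, 10),
).eval()

dummy_input = torch.randn(1, 3, 512, 512)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",            # hypothetical output path
    input_names=["inputs"],
    output_names=["outputs"],
    opset_version=11,
)

The resulting ONNX file is then parsed by TensorRT's OnnxParser, as shown further below.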
IndexError: Attribute not found: axes
IndexError: Attribute not found: axes
I searched the Internet for the "Attribute not found" error; most of the relevant posts concern the Pad operator. Later I found a post on CSDN reporting the same error about axes, and I thank that blogger for helping me locate the problem quickly. The link to the post is also included here: https://blog.csdn.net/ChuiGeDaQiQiu/article/details/119821974
That article points out that this error is related to the Squeeze operator. If you do not specify an axis when calling squeeze in PyTorch, every axis of size 1 in the tensor is squeezed, which is exactly the behavior I relied on in the following definition.
class ClassificationHead(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.initial_block = DownsamplerBlock(128, 256)
        self.layers = nn.ModuleList()
        for x in range(0, 5):
            self.layers.append(non_bottleneck_1d(256, 0.3, 1))
        self.layers.append(DownsamplerBlock(256, 512))
        for x in range(0, 2):
            self.layers.append(non_bottleneck_1d(512, 0.3, 1))
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512, num_classes)

    def forward(self, input):
        output = self.initial_block(input)
        for layer in self.layers:
            output = layer(output)
        output = self.pool(output)
        output = self.fc(output.squeeze())  # there is an error in this line
        return output
Thinking about it afterwards: after the AdaptiveAvgPool2d, the tensor should have shape [batch_size, channels, 1, 1]. The squeeze is meant to remove the last two dimensions before the tensor goes into the fully connected layer.
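As a quick illustration (my own shape check, not part of the model code above), a bare squeeze() removes every size-1 dimension, so when batch_size is 1 it also removes the batch dimension:

import torch

pooled = torch.randn(1, 512, 1, 1)   # shape after AdaptiveAvgPool2d((1, 1))
print(pooled.squeeze().shape)        # torch.Size([512]) -- the batch dimension is gone too
pooled = torch.randn(4, 512, 1, 1)
print(pooled.squeeze().shape)        # torch.Size([4, 512])

Because no axes are given here, the exported ONNX Squeeze node also carries no explicit axes, which appears to be what trips up the ONNX-to-TensorRT conversion.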
I changed that line to:
output = self.fc(output.flatten(start_dim=1))  # the fixed line
For the usage of flatten, refer to the official PyTorch documentation: https://pytorch.org/docs/stable/generated/torch.flatten.html
The parameter start_dim=1 makes the flatten start from dim 1 instead of dim 0, which meets the requirement above: the batch dimension is kept, and the trailing size-1 dimensions are folded into the channel dimension.
After changing this line, the error disappears.
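As an extra check (a hypothetical minimal reproduction, not the author's full network), exporting a small pool-plus-flatten head and listing the ONNX ops confirms the graph no longer relies on a Squeeze node:

import torch
import torch.nn as nn
import onnx

class TinyHead(nn.Module):
    """Stand-in for the classification head: pool, flatten, fully connected."""
    def __init__(self):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512, 10)

    def forward(self, x):
        x = self.pool(x)
        return self.fc(x.flatten(start_dim=1))

torch.onnx.export(TinyHead(), torch.randn(1, 512, 16, 16), "tiny_head.onnx", opset_version=11)
print([node.op_type for node in onnx.load("tiny_head.onnx").graph.node])
# Expected to contain something like GlobalAveragePool, Flatten, Gemm -- and no Squeeze.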
MemoryError: std::bad_alloc
During the conversion from ONNX to TensorRT, the second error encountered is
MemoryError: std::bad_alloc
The final error reported in the traceback is
[TensorRT] ERROR: ArgMax_255: at least 2 dimensions are required for input.
This error presumably comes from the use of torch.argmax in the network; perhaps TensorRT does not support this operator well here (I am not certain of the exact cause). In any case, after removing torch.argmax from the network, the error message disappears.
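One way to do this (my own sketch of a workaround, not necessarily exactly what was done beyond removing the operator) is to let the network output the raw class scores and compute the argmax on the host after TensorRT inference:

import numpy as np

def postprocess(scores: np.ndarray) -> np.ndarray:
    # scores: raw network output, assumed layout (N, num_classes, H, W).
    # The argmax over the class axis replaces the torch.argmax that was
    # removed from the network's forward() before export.
    return np.argmax(scores, axis=1)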
ERROR: Try increasing the workspace size with IBuilderConfig::setMaxWorkspaceSize() if using IBuilder::buildEngineWithConfig, or IBuilder::setMaxWorkspaceSize() if using IBuilder::buildCudaEngine
The complete error message is
[TensorRT] ERROR: Try increasing the workspace size with IBuilderConfig::setMaxWorkspaceSize() if using IBuilder::buildEngineWithConfig, or IBuilder::setMaxWorkspaceSize() if using IBuilder::buildCudaEngine.
[TensorRT] ERROR: ../builder/tacticOptimizer.cpp (1820) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node ConvTranspose_132.)
[TensorRT] ERROR: ../builder/tacticOptimizer.cpp (1820) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node ConvTranspose_132.)
Completed creating Engine
At first I mainly suspected the later lines, "Could not find any implementation for node ConvTranspose_132". After searching online, however, the problem most likely lies in the first error, namely "Try increasing the workspace size".
The main advice in the online answers is to set max_workspace_size to 1 << 30.
A snippet of my code is excerpted here to illustrate the problem:
with trt.Builder(TRT_LOGGER) as builder, \
        builder.create_network(explicit_batch) as network, \
        trt.OnnxParser(network, TRT_LOGGER) as parser, \
        builder.create_builder_config() as config:
    # Use the ONNX parser to bind the network; the graph is filled in later during parsing
    profile = builder.create_optimization_profile()
    profile.set_shape("inputs", (1, 3, 512, 512), (1, 3, 512, 512), (1, 3, 512, 512))
    config.add_optimization_profile(profile)
    builder.max_workspace_size = 1 << 30  # pre-allocated workspace size, i.e. the maximum GPU memory the ICudaEngine may use during execution
    builder.max_batch_size = max_batch_size  # the maximum batch size usable at execution time
    builder.fp16_mode = fp16_mode
    builder.int8_mode = int8_mode
Note this line in the parameter settings:
builder.max_workspace_size = 1 << 30  # pre-allocated workspace size, i.e. the maximum GPU memory the ICudaEngine may use during execution
After reading this post on the NVIDIA forums, I saw the problem: https://forums.developer.nvidia.com/t/tensorrt-engine-cannot-be-built-due-to-workspace-size-even-if-its-set-higher/170898/3
The fix is to apply the workspace setting to the config instead of the builder. In other words, the line above should be changed to
config.max_workspace_size = 1 << 30  # pre-allocated workspace size, i.e. the maximum GPU memory the ICudaEngine may use during execution
so that the maximum workspace available to the GPU is set correctly.
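For reference, here is a sketch of the corrected build code under the same assumptions as the snippet above (TRT_LOGGER, explicit_batch, and the ONNX file name are placeholders; this follows the older TensorRT 7.x-style Python API used in this post, and details differ in newer versions):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

with trt.Builder(TRT_LOGGER) as builder, \
        builder.create_network(explicit_batch) as network, \
        trt.OnnxParser(network, TRT_LOGGER) as parser, \
        builder.create_builder_config() as config:
    with open("model.onnx", "rb") as f:   # hypothetical ONNX file
        parser.parse(f.read())
    profile = builder.create_optimization_profile()
    profile.set_shape("inputs", (1, 3, 512, 512), (1, 3, 512, 512), (1, 3, 512, 512))
    config.add_optimization_profile(profile)
    config.max_workspace_size = 1 << 30   # workspace is now set on the config, not the builder
    # fp16/int8 are likewise enabled through the config in config-based builds, e.g.:
    # config.set_flag(trt.BuilderFlag.FP16)
    engine = builder.build_engine(network, config)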