[DL] How to Check Whether Quantization Was Applied to a TFLite Model

운호(Noah) 2022. 6. 6. 14:05

Before we begin,

With TFLite, you can check whether quantization was applied by looking at the interpreter's input type and output type.

Example code

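import tensorflow as tf

# Load the converted model (replace with the path to your .tflite file).
interpreter = tf.lite.Interpreter(model_path="path/to/model.tflite")

# The dtype of the first input and output tensor shows the precision of the model's I/O.
input_type = interpreter.get_input_details()[0]['dtype']
print('input: ', input_type)
output_type = interpreter.get_output_details()[0]['dtype']
print('output: ', output_type)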

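# Both types are FP32, which is what a model converted to TFLite without quantization reports.
# (Weight-only schemes such as FP16 quantization can also keep float32 inputs and outputs.)
input:  <class 'numpy.float32'>
output:  <class 'numpy.float32'>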
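Since the input and output types alone can stay float32 even when the weights were quantized, it may help to look at the dtype of every tensor in the model as well. The following is a minimal sketch, not part of the original post: summarize_dtypes and report are hypothetical helper names, and the assumption that quantized weights show up as float16 or int8 entries in get_tensor_details() should be verified against your own model.

```python
from collections import Counter


def summarize_dtypes(details):
    """Count tensors per dtype name from a list of TFLite detail dicts."""
    # Each dict carries a 'dtype' entry such as numpy.float32; its class
    # name ('float32') is enough to tell the precisions apart.
    return Counter(
        getattr(d["dtype"], "__name__", str(d["dtype"])) for d in details
    )


def report(model_path):
    """Print I/O dtypes and a per-dtype count of all tensors in the model."""
    # Imported lazily so summarize_dtypes can be used without TensorFlow.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    print("inputs: ", dict(summarize_dtypes(interpreter.get_input_details())))
    print("outputs:", dict(summarize_dtypes(interpreter.get_output_details())))
    # Internal tensors include the stored weights, which is where weight-only
    # quantization (assumed: float16 or int8 weights) would become visible.
    print("all:    ", dict(summarize_dtypes(interpreter.get_tensor_details())))
```

Calling report("path/to/model.tflite") on a model and seeing only float32 in all three lines would suggest no quantization was applied; a mix of dtypes suggests some tensors were quantized.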

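Additional note

If the input type and output type of a model quantized to FP16 still show up as 'numpy.float32',
compare the saved file size of the FP32 model with that of the FP16 model.

If quantization was applied correctly, the FP16 model is roughly half the size of the FP32 model.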

  ls -lh

  # The FP32 model is 83K
  -rw-rw-r-- 1 kbuilder kbuilder 83K Aug 14 00:42 mnist_model.tflite
  # The FP16 model is 44K
  -rw-rw-r-- 1 kbuilder kbuilder 44K Aug 14 00:42 mnist_model_quant_f16.tflite
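The same size comparison can be scripted instead of read off ls by eye. This is a small sketch under stated assumptions: size_ratio and looks_like_fp16 are hypothetical helper names, and the 0.15 tolerance around one half is an arbitrary choice, not a TFLite rule.

```python
import os


def size_ratio(fp32_path, quantized_path):
    """Return the quantized file size divided by the FP32 file size."""
    fp32_size = os.path.getsize(fp32_path)
    quantized_size = os.path.getsize(quantized_path)
    if fp32_size == 0:
        raise ValueError("FP32 model file is empty")
    return quantized_size / fp32_size


def looks_like_fp16(fp32_path, quantized_path, tolerance=0.15):
    """True when the quantized file is about half the FP32 file's size."""
    # tolerance is an assumed margin; file metadata keeps the ratio from
    # being exactly 0.5 in practice.
    return abs(size_ratio(fp32_path, quantized_path) - 0.5) <= tolerance
```

For the files listed above, the ratio would be about 44 / 83 ≈ 0.53, which falls inside the assumed tolerance.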

Reference

https://www.tensorflow.org/lite/performance/post_training_integer_quant?hl=ko