[DICOM Surprises, Part 2] Understanding DICOM multi-frame images and splitting them into single-frame images with pydicom
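1. Background:
While deploying our company's AI product at a county hospital recently, we ran into a thorny problem: multi-frame CT images.
I was already familiar with multi-frame images. They mostly come from modalities such as ultrasound, which need to capture dynamic processes like the contraction and relaxation of the heart or the flow of blood. Multi-frame images from a CT scanner, however, are fairly rare.
In my time building CT image-processing software, this is only the second time I have encountered them. The first time, the hospital's data volume was small, so we could handle the files by hand. This time is different. First, a competing product is installed at the hospital, and it handles multi-frame images without trouble. Second, the data volume is large: every study from one particular CT scanner, whether chest and lung, coronary, head and neck, hepatobiliary, or urinary, comes out as multi-frame images. Processing them by hand is no longer an option.
We therefore urgently needed to add multi-frame support to the existing processing pipeline. The first step is to split each multi-frame image in a series, following its internal frame order, into a sequence of single-frame images, one per frame.
A web search shows that the Java-based dcm4che and the C++-based dcmtk both offer ready-made ways to handle multi-frame images. A pydicom-based gist is listed as well.
dcm4che: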
https://github.com/dcm4che/dcm4che/blob/master/dcm4che-tool/dcm4che-tool-emf2sf/README.md
dcmtk:
https://stackoverflow.com/questions/68440620/multiframe-ultrasound-dicom-file-creation
pydicom:
https://gist.github.com/pangyuteng/803cf2fda3d7568aaa348fdd409937d7
The DICOM standard's description of multi-frame images includes a figure that illustrates the idea well:
https://dicom.nema.org/medical/dicom/current/output/chtml/part03/sect_C.7.6.16.html#sect_C.7.6.16.1.1
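That figure shows a multi-frame image and, among the attributes it contains, the ones tied to each individual frame. The two most important attributes are listed below (source: https://dicom.innolitics.com/ciods/enhanced-ct-image/enhanced-ct-image-multi-frame-functional-groups/52009230):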
Attribute | Tag (Group, Element) | Description |
---|---|---|
Shared Functional Groups Sequence | (5200,9229) | Sequence that contains the Functional Group Macros that are shared for all frames in this SOP Instance and Concatenation. |
Per-Frame Functional Groups Sequence | (5200,9230) | Sequence that contains the Functional Group Sequence Attributes corresponding to each frame of the Multi-frame Image. The first Item corresponds with the first frame, and so on. One or more Items shall be included in this Sequence. The number of Items shall be the same as the number of frames in the Multi-frame image. |
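As the names suggest, the Shared Functional Groups Sequence holds the tag information common to all frames of one multi-frame image,
whereas the Per-Frame Functional Groups Sequence holds the information specific to each frame. The number of Per-Frame items should therefore equal the number of frames (NumberOfFrames) in the multi-frame image.
The position of both sequences in the DICOM Standard Browser tree also shows that they belong to the Enhanced CT Image object rather than the classic single-frame CT Image object. Note that "Enhanced" here names the newer multi-frame object format; it does not mean a contrast-enhanced scan as opposed to a plain scan.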
Below, I walk through a concrete multi-frame image as an example.
2. Solution:
The following reads and splits the image with pydicom:
```python
import pydicom

file = "/path/to/multiFrameImage.dcm"
ds = pydicom.dcmread(file)
print(ds.NumberOfFrames)
# output: 494
# This multi-frame image contains 494 frames.
```
2.1 Shared Functional Groups Sequence
Inspecting this sequence shows which tags are shared by all frames of a multi-frame image:
Sequence | Tag |
---|---|
CT Acquisition Type Sequence | AcquisitionType, … |
CT Acquisition Details Sequence | |
CT Table Dynamics Sequence | |
CT Geometry Sequence | |
CT Reconstruction Sequence | Convolution Kernel Group, … |
CT X-Ray Details Sequence | |
CT Image Frame Type Sequence | Frame Type, Pixel Presentation, Volumetric Properties, … |
Frame Anatomy Sequence | |
Pixel Measures Sequence | Slice Thickness, Pixel Spacing |
Frame VOI LUT Sequence | Window Center, Window Width |
Pixel Value Transformation Sequence | Rescale Intercept, Rescale Slope, Rescale Type |
2.2 Per Frame Functional Groups Sequence
The contents of PerFrameFunctionalGroupsSequence are as follows:
Sequence | Tag |
---|---|
CT Exposure Sequence | |
CT Position Sequence | |
Frame Content Sequence | |
Plane Position Sequence | Image Position Patient |
Plane Orientation Sequence | Image Orientation Patient |
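I had assumed that InstanceNumber was also an attribute of each frame. Looking at the contents, though, InstanceNumber is not in the Per-Frame sequence, and naturally not in the Shared sequence either.
So, when splitting a multi-frame image into single-frame images, if each image needs an InstanceNumber, a new one has to be assigned to each frame while iterating over the frames.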
2.3 Pixel Array
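The pixel_array that pydicom returns for a single-frame image differs from the one it returns for a multi-frame image.
The main difference is the shape: for a grayscale image, a single frame gives (rows, columns), while a multi-frame image with more than one frame gives (frames, rows, columns).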
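The shape difference can be illustrated without a DICOM file. In the sketch below, plain NumPy arrays stand in for what `pixel_array` typically returns for grayscale data; the array names and sizes are made up for illustration.

```python
import numpy as np

# Stand-ins for pixel_array; the sizes are arbitrary.
single_frame = np.zeros((512, 512), dtype=np.int16)    # (rows, columns)
multi_frame = np.zeros((4, 512, 512), dtype=np.int16)  # (frames, rows, columns)

print(single_frame.shape)
print(multi_frame.shape)

# Indexing the first axis of the multi-frame array yields one 2-D frame
# with the same layout as a single-frame image.
one_frame = multi_frame[0]
assert one_frame.shape == single_frame.shape

# tobytes() returns the raw buffer of that frame: rows * columns * 2 bytes
# for 16-bit pixels. This is the form that PixelData expects.
raw = one_frame.tobytes()
print(len(raw))
```

This is why the splitting step can simply walk the first axis of the array and write each slice out as its own PixelData.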
2.4 Splitting a multi-frame image into single-frame images
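```python
import copy
import os
from typing import Optional, Tuple

import pydicom
from pydicom.dataset import Dataset, FileDataset
from pydicom.uid import ImplicitVRLittleEndian


def split_multi_frame_to_single(
    file_path: str = "/path/to/multiFrames.dcm",
    output_dir: str = "output_dir",
    mul_acquisition_number: int = 1,
) -> int:
    multi_img_dataset = pydicom.dcmread(file_path)
    number_of_frames = int(multi_img_dataset.NumberOfFrames)
    series_img_array = multi_img_dataset.pixel_array
    base_sop_iuid = multi_img_dataset.SOPInstanceUID
    os.makedirs(output_dir, exist_ok=True)

    for i in range(number_of_frames):
        # Every DICOM file needs file meta information. Copy it so that the
        # changes below do not modify the source dataset.
        cur_file_meta = copy.deepcopy(multi_img_dataset.file_meta)
        # A DICOM file starts with a 128-byte preamble, followed by the "DICM"
        # prefix that marks it as a DICOM file.
        cur_preamble = multi_img_dataset.preamble
        # Every DICOM file needs a SOPInstanceUID. The new UID is the
        # multi-frame image's SOPInstanceUID with the frame index appended.
        new_sop_iuid = f"{base_sop_iuid}.{i}"
        output_file_name = os.path.join(output_dir, f"single_frame_{i}.dcm")

        # Build a fresh template per frame so that frames never share state.
        template = _generate_template_dataset(multi_img_dataset)
        ds = FileDataset(output_file_name, template, file_meta=cur_file_meta, preamble=cur_preamble)
        ds.SOPInstanceUID = new_sop_iuid
        ds.file_meta.MediaStorageSOPInstanceUID = new_sop_iuid

        curr_array = series_img_array[i]
        # Convert the pixel values to bytes and store them in the PixelData tag.
        ds.PixelData = curr_array.tobytes()
        ds.Rows, ds.Columns = curr_array.shape
        # pixel_array is already decoded, so use an uncompressed transfer syntax.
        ds.file_meta.TransferSyntaxUID = ImplicitVRLittleEndian

        # Medical images carry 3-D position and orientation information, so read
        # ImagePositionPatient and ImageOrientationPatient for this frame from
        # the PerFrameFunctionalGroupsSequence of the multi-frame image.
        position, orientation = _get_ImagePosition_ImageOrientation(multi_img_dataset, i)
        if position is None or orientation is None:
            raise ValueError(f"frame {i} has no position or orientation information")
        ds.ImagePositionPatient, ds.ImageOrientationPatient = position, orientation
        # SliceLocation is generally kept equal to the z value of
        # ImagePositionPatient (this holds for axial slices).
        ds.SliceLocation = ds.ImagePositionPatient[-1]
        ds.InstanceNumber = i + 1
        # AcquisitionNumber records which scan a frame came from. A series may
        # contain several multi-frame images, and giving every split-out image
        # an AcquisitionNumber makes it easy to tell later which multi-frame
        # image it was split from.
        ds.AcquisitionNumber = mul_acquisition_number
        ds.save_as(output_file_name)

    # The caller passes the returned value in for the next multi-frame image.
    return mul_acquisition_number + 1
```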
```python
# Attributes copied unchanged from the multi-frame image into each single-frame image.
_COPIED_KEYWORDS = (
    "ContentDate", "ContentTime", "ImageType", "SOPClassUID",
    "StudyDate", "SeriesDate", "AcquisitionDate", "Manufacturer",
    "AccessionNumber", "Modality", "StationName", "SeriesNumber",
    "SeriesDescription", "ManufacturerModelName", "PatientName", "PatientID",
    "SeriesInstanceUID", "StudyInstanceUID", "FrameOfReferenceUID",
    "BitsStored", "BitsAllocated", "SamplesPerPixel", "HighBit",
    "PhotometricInterpretation", "PixelRepresentation",
)


def _generate_template_dataset(dicom: FileDataset, update_tags: Optional[dict] = None) -> Dataset:
    DS = Dataset()
    for keyword in _COPIED_KEYWORDS:
        # Skip attributes that the source file does not contain.
        if keyword in dicom:
            setattr(DS, keyword, getattr(dicom, keyword))
    # Give every split-out single-frame image the tags that all frames of the
    # multi-frame image share.
    shared_functional_groups_sequence = _get_slice_SharedFunctionalGroupsSequence(dicom)
    if shared_functional_groups_sequence is not None:
        for key in shared_functional_groups_sequence.keys():
            DS[key] = shared_functional_groups_sequence[key]
    if isinstance(update_tags, dict):
        # update_tags maps tags to DataElement objects that override the defaults.
        for key in update_tags.keys():
            try:
                DS[key] = update_tags[key]
            except Exception as e:
                print(e)
    return DS
```
```python
def _get_slice_SharedFunctionalGroupsSequence(input_dicom: FileDataset) -> Optional[dict]:
    """Collect the shared pixel-measure and VOI LUT elements, keyed by tag."""
    try:
        result = {}
        shared_groups = input_dicom.SharedFunctionalGroupsSequence[0]
        pixel_measures_sequence = shared_groups.PixelMeasuresSequence[0]
        frame_voi_lut = shared_groups.FrameVOILUTSequence[0]
        for key in pixel_measures_sequence.keys():
            result[key] = pixel_measures_sequence[key]
        for key in frame_voi_lut.keys():
            result[key] = frame_voi_lut[key]
        return result
    except AttributeError as e:
        print(e)
        print("The input file is not of the expected type")
        return None
    except Exception as e:
        print(e)
        return None
```
```python
def _get_ImagePosition_ImageOrientation(inputDicom: FileDataset, i: int) -> Tuple:
    """Return (ImagePositionPatient, ImageOrientationPatient) for frame i, or (None, None)."""
    try:
        per_frame_groups = inputDicom.PerFrameFunctionalGroupsSequence[i]
        cur_imagepositionpatient = per_frame_groups.PlanePositionSequence[0].ImagePositionPatient
        cur_ImageOrientationPatient = per_frame_groups.PlaneOrientationSequence[0].ImageOrientationPatient
        return cur_imagepositionpatient, cur_ImageOrientationPatient
    except AttributeError as e:
        print(e)
        print("The input file is not of the expected type")
        return None, None
    except Exception as e:
        print(e)
        return None, None
```
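The splitting code derives each new SOPInstanceUID by appending the frame index to the source UID. DICOM limits a UID to 64 characters, made of numeric components separated by dots, where a multi-digit component may not start with 0. The sketch below checks a derived UID against those rules; the helper name `derive_frame_uid` and the sample UID root are hypothetical.

```python
import re

MAX_UID_LENGTH = 64  # DICOM limits a UID to 64 characters
# Components are digits only; a multi-digit component must not start with 0.
_UID_PATTERN = re.compile(r"^(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*$")


def derive_frame_uid(base_uid: str, frame_index: int) -> str:
    """Append the frame index to base_uid and validate the result."""
    candidate = f"{base_uid}.{frame_index}"
    if len(candidate) > MAX_UID_LENGTH:
        raise ValueError(f"derived UID exceeds {MAX_UID_LENGTH} characters: {candidate}")
    if not _UID_PATTERN.match(candidate):
        raise ValueError(f"derived UID is not well formed: {candidate}")
    return candidate


# The UID below is a made-up example, not a registered root.
print(derive_frame_uid("1.2.840.99999.1", 0))
```

If the source UID is already close to the 64-character limit, generating a fresh UID (for instance with pydicom's `generate_uid()`) would be one alternative to appending a suffix.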
References:
1. DICOM Standard Browser:
https://dicom.innolitics.com/ciods/enhanced-ct-image/enhanced-ct-image-multi-frame-functional-groups/52009230
2. DICOM Standard, Part 3:
https://dicom.nema.org/medical/dicom/current/output/chtml/part03/sect_C.7.6.16.html#sect_C.7.6.16.1.2
End.