AF3: A Walkthrough of the _realign_pdb_template_to_query Function
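The _realign_pdb_template_to_query function in the templates module of AlphaFold3 realigns a template hit so that the template sequence comes from the current mmCIF structure file rather than from a possibly outdated sequence in the template database (such as PDB70). In AlphaFold3, templates are usually found with search tools such as HHsearch, but the template sequences stored in those databases can differ from the latest mmCIF files in the official PDB archive. The function therefore uses Kalign to align the template sequence reported by the hit against the chain sequence in the mmCIF file, and then updates the query-to-template mapping to point into that mmCIF sequence. According to the docstring, the new sequence and mapping are returned only if the mmCIF sequence is 90% identical to the old one.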
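The remapping step described above can be sketched as follows. This is a minimal illustration, not the library's code: it assumes the two template sequences have already been aligned (for example by Kalign) and are passed in as equal-length gapped strings. The function name remap_via_alignment, the use of -1 for query positions that lose their template partner, and the choice of the shorter sequence length as the identity denominator are all assumptions made for the example.

```python
from typing import Dict, Mapping, Tuple


def remap_via_alignment(
    old_aligned: str,
    new_aligned: str,
    old_mapping: Mapping[int, int],
    min_identity: float = 0.9,
) -> Tuple[str, Dict[int, int]]:
    """Hypothetical helper: re-point a query->template mapping at a new sequence.

    old_aligned / new_aligned are the old (hit) and new (mmCIF) template
    sequences after pairwise alignment, with '-' marking gaps.
    """
    if len(old_aligned) != len(new_aligned):
        raise ValueError("aligned sequences must have equal length")

    old_to_new: Dict[int, int] = {}
    old_idx = new_idx = -1  # ungapped indices into each sequence
    num_same = 0
    for old_aa, new_aa in zip(old_aligned, new_aligned):
        if old_aa != "-":
            old_idx += 1
        if new_aa != "-":
            new_idx += 1
        if old_aa != "-" and new_aa != "-":
            # Both sequences have a residue in this column: record the pair.
            old_to_new[old_idx] = new_idx
            if old_aa == new_aa:
                num_same += 1

    # Assumption: identity is measured against the shorter ungapped sequence.
    shorter = min(old_idx + 1, new_idx + 1)
    if shorter == 0 or num_same / shorter < min_identity:
        raise ValueError("template sequences are too dissimilar to realign")

    new_sequence = new_aligned.replace("-", "")
    # Assumption: query positions whose old template residue has no
    # counterpart in the new sequence are mapped to -1.
    new_mapping = {q: old_to_new.get(t, -1) for q, t in old_mapping.items()}
    return new_sequence, new_mapping


# The hit covered "ACDF"; the mmCIF chain is "MACDEF".
seq, mapping = remap_via_alignment("-ACD-F", "MACDEF", {0: 0, 1: 1, 2: 3})
print(seq, mapping)
```

In this example the old template indices 0, 1 and 3 shift to 1, 2 and 5 because the mmCIF chain has an extra leading residue and an insertion; the query indices are unchanged.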
Source code:
def _realign_pdb_template_to_query(
    old_template_sequence: str,
    template_chain_id: str,
    mmcif_object: mmcif_parsing.MmcifObject,
    old_mapping: Mapping[int, int],
    kalign_binary_path: str,
) -> Tuple[str, Mapping[int, int]]:
    """Aligns template from the mmcif_object to the query.

    In case PDB70 contains a different version of the template sequence, we need
    to perform a realignment to the actual sequence that is in the mmCIF file.
    This method performs such realignment, but returns the new sequence and
    mapping only if the sequence in the mmCIF file is 90% identical to the old
    sequence.

    Note that the old_template_sequence comes from the hit, and contains only that
    part of the chain that matches with the query while the new_template_sequence
    is the full chain.

    Args:
      old_template_sequence: The template sequence that was returned by the PDB
        template search (typically done using HHSearch).
      template_chain_id: The template chain id was returned by the PDB template
        search (typically done using HHSearch). This is used to find the right
        chain in the mmcif_object chain_to_seqres mapping.
      mmcif_object: A mmcif_object which holds the actual template data.
      old_mapping: A mapping from the query sequence to the template sequence.
        This mapping will be used to compute the new mapping from the query
        sequence to the actual mmcif_object template sequence by aligning the
        old_template_sequence and the actual template sequence.
      kalign_binary_path: The path to a kalign executable.

    Returns:
      A t