Motif Counts and Consensus with Bio.motifs
How to Get Motif Counts and Consensus with the Bio.motifs Module?
✍: FYIcenter.com
Motif counts record how often each letter appears at each position
across a set of motif samples.
The resulting matrix of motif counts is also called a PFM (Position Frequency Matrix).
Motif consensus is the sequence formed by taking, at each position of the motif, the letter with the largest value in the corresponding column of the motif counts. In other words, the motif consensus is built from the most frequent letter at each position of the given sample set, so it serves as an estimate of the most likely sequence in the entire population.
Motif anticonsensus is the sequence formed by taking, at each position of the motif, the letter with the smallest value in the corresponding column of the motif counts. In other words, the motif anticonsensus is built from the least frequent letter at each position of the given sample set, so it serves as an estimate of the least likely sequence in the entire population.
1. Create a motif object from 7 sequences that match the motif pattern "[AT]A[CT][ACG][AC]".
fyicenter$ python
>>> from Bio import motifs
>>> samples = [
...     "TACAA",
...     "TACGC",
...     "TACAC",
...     "TACCC",
...     "AACCC",
...     "AATGC",
...     "AATGC"
... ]
>>> m = motifs.create(samples)
2. View motif counts.
>>> print(m.counts)
        0      1      2      3      4
A:   3.00   7.00   0.00   2.00   1.00
C:   0.00   0.00   5.00   2.00   6.00
G:   0.00   0.00   0.00   3.00   0.00
T:   4.00   0.00   2.00   0.00   0.00
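To make the meaning of the count matrix concrete, the same numbers can be tallied without Biopython. The following is only a minimal sketch in plain Python: the helper name `position_counts` is hypothetical and is not part of Bio.motifs, and it assumes all sequences have the same length and use only the letters A, C, G and T.

```python
from collections import Counter

# The same 7 sample sequences used to create the motif.
samples = ["TACAA", "TACGC", "TACAC", "TACCC", "AACCC", "AATGC", "AATGC"]
ALPHABET = "ACGT"


def position_counts(seqs, letters=ALPHABET):
    """Hypothetical helper: return {letter: [count at position 0, 1, ...]}."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    # zip(*seqs) yields one tuple of letters per motif position (column).
    columns = [Counter(column) for column in zip(*seqs)]
    # A Counter returns 0 for letters that never occur in a column.
    return {letter: [col[letter] for col in columns] for letter in letters}


counts = position_counts(samples)
for letter in ALPHABET:
    print(letter, counts[letter])
```

Each printed row should line up with the corresponding row of `m.counts`, with integers in place of the formatted floats.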
3. View motif consensus.
>>> print(m.consensus)
TACGC
4. View motif anticonsensus.
>>> print(m.anticonsensus)
CCATG
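The consensus and anticonsensus can also be derived by hand from the count matrix, which may help when checking the definitions. This is a sketch in plain Python, not Biopython's own implementation: the helper `pick_per_position` is hypothetical, and the rule that a tie goes to the first letter in the order A, C, G, T is an assumption chosen because it reproduces the outputs shown here.

```python
LETTERS = "ACGT"

# Count matrix copied from the print(m.counts) output in step 2.
counts = {
    "A": [3, 7, 0, 2, 1],
    "C": [0, 0, 5, 2, 6],
    "G": [0, 0, 0, 3, 0],
    "T": [4, 0, 2, 0, 0],
}


def pick_per_position(counts, choose):
    """Hypothetical helper: apply max or min to every column of the matrix."""
    length = len(counts[LETTERS[0]])
    picked = []
    for i in range(length):
        # max() and min() return the first best item, so ties go to the
        # earliest letter in LETTERS (an assumption, see the text above).
        picked.append(choose(LETTERS, key=lambda letter: counts[letter][i]))
    return "".join(picked)


consensus = pick_per_position(counts, max)
anticonsensus = pick_per_position(counts, min)
print(consensus)      # TACGC
print(anticonsensus)  # CCATG
```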
5. If multiple letters share the highest count at a position, Biopython selects one of them.
>>> samples = [
...     "TACAA",
...     "TACGC",
...     "TACAC",
...     "TACCC",
...     "AACCC",
...     "AATGC",
...     "AATGC",
...     "AACGC"
... ]
>>> m = motifs.create(samples)
>>> print(m.consensus)
AACGC
As you can see, the first position (index 0 in the counts table) has both A and T with the highest count of 4. Biopython selects A.
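Because a tie is invisible in the consensus string itself, it can be useful to list the tied positions explicitly. The following is a minimal sketch in plain Python, assuming equal-length sequences; the helper name `tied_top_letters` is hypothetical and is not a Bio.motifs function.

```python
from collections import Counter

# The 8 sample sequences from step 5.
samples = [
    "TACAA", "TACGC", "TACAC", "TACCC",
    "AACCC", "AATGC", "AATGC", "AACGC",
]


def tied_top_letters(seqs):
    """Hypothetical helper: map position index -> letters tied for the top count."""
    ties = {}
    for index, column in enumerate(zip(*seqs)):
        tally = Counter(column)
        top = max(tally.values())
        winners = sorted(letter for letter, n in tally.items() if n == top)
        if len(winners) > 1:
            ties[index] = winners
    return ties


ties = tied_top_letters(samples)
print(ties)  # {0: ['A', 'T']}
```

Only index 0 is reported, which matches the observation that A and T each occur 4 times in the first column.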
2023-07-05