Part A (From Pranam)

<aside> ⚠️ Optional for MIT/Harvard Students and Committed Listeners. Due at the start of class March 12

</aside>

  1. Sign up for HuggingFace and request access for PepMLM:

    https://huggingface.co/ChatterjeeLab/PepMLM-650M

  2. Find the amino acid sequence for SOD1 in UniProt (ID: P00441), a protein when mutated, can cause Amyotrophic lateral sclerosis (ALS). In fact, the A4V (when you change position 4 from Alanine to Valine) causes the most aggressive form of ALS, so make that change in the sequence

  3. Enter your mutated SOD1 sequence into the PepMLM inference API and generate 4 peptides of length 12 amino acids (Step 5 takes a while so you can also just pick 1 or 2 peptides)

  4. To your list, add this known SOD1-binding peptide to your list: FLYRWLPSRRGG [from -https://genesdev.cshlp.org/content/22/11/1451]

  5. Go to AlphaFold-Multimer (https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb). This is similar to what you did for homework last week but instead for a protein-peptide complex

  6. After running AlphaFold-Multimer with your 5 peptides alongside your mutated SOD1 sequence, plot the ipTM scores, which measures the relative confidence of the binding region.

  7. Provide a 1 paragraph write-up of your results and be sure to cite the relevant works

  8. Part B (Final Project: L-Protein Mutants)

<aside> ⚠️ Mandatory for MIT/Harvard Students and Committed Listeners. Due at the start of class March 12 This homework requires computation that might take you a while to run. So please get started early.

</aside>

Bacteriophage MS2 is a single stranded RNA virus whose genome only encodes 4 proteins -the maturation protein (A-protein), the lysis (L-Protein) protein, the coat protein (cp), and the replicase (rep) protein. Bacteriophages infect E-coli. Upon infection, the L-Protein forms pores in the E-coli cell membrane which eventually leads to breakdown of the membrane (Lysis). DnaJ is a chaperone protein in E-coli (chaperone proteins are proteins that assist during protein folding). It is thought to be involved in the lysis mechanism. In this homework, we will explore if computational models we learnt about in the last class are useful for designing variants/mutants of the lysis protein sequence. We will study the effects of L-protein mutants on the bacteriophage infectivity.

You can read more about the final project in the Final Project Page.

Untitled

Goal:

Our goal for this part of the homework is to create mutants of L-protein that affect its lysis activity and/or its interaction with DNAj. Making a mutation for L-protein without a way to computationally predict what happens to lysis or its interaction with DNAj is hard. So we are going to try various hypotheses on how to use the models from last week and also try a few other tools. These mutants will be tested in the lab.

L-Protein and DNAj Sequence

Lysis Protein Sequence (UniProtKB ID: https://www.uniprot.org/uniprotkb/P03609/entry)

METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT