Journal of Youths in Science
VOLUME 10 ISSUE 1
Using Machine Learning to Play Video Games by Daniel Liu
Hardwater Effects on Hair by Dahun Ryu
CAR T-Cell against Cancer by Clyde Xu
Art by Seyoung Lee
Torrey Pines High School Scripps Ranch High School Duxbury High School Ed W. Clark High School
LAMP High School Tomball Memorial High School Washington High School Westminster High School
Contact us if you are interested in becoming a new member or starting a chapter, or if you have any questions or comments. Website: www.journys.org // Email: email@example.com
Journal of Youths in Science, Attn: Mary Anne Rall, 3710 Del Mar Heights Road, San Diego, CA 92130
2 | JOURNYS | FALL 2018
Journal of Youths in Science issue 10.1-winter 2018
Isolating Exosomes Derived From Human Natural Killer Cells for Characterization Wesley Huang
RT-Spectrum of Oral Bacteria by Age and Inhibitory Effect of Natural Oriental Herbals Do Hun Kim, Dahun Ryu
Using Machine Learning to Play Video Games Daniel Liu
Evaluating the Effects of Binding Pocket Mutants in the PKA Regulatory Domain Vainavi Viswanath, Benjamin Konecny
Computation Linguistics: Stylometric Algorithms for Authorship Attribution Pulkit Rampa
Effects of hardwater on hair Dahun Ryu
CAR T-Cell Immunotherapy Against Cancers Clyde Xu
by Wesley Huang
Art by Seyoung Lee
Exosomes have been an area of great interest in the field of cellular biology, as they appear to have a role in many essential cellular processes. Natural killer (NK) cell-derived exosomes are of particular interest, as they have been noted for their cytotoxicity against cancer cells and may have potential in immunotherapy. However, current methods of isolating exosomes have many limitations. We present a more effective, efficient, and simple method to isolate large quantities of NK cell-derived exosomes using ÄKTA start, a protein purification system centered around size-exclusion chromatography. Using the ÄKTA system, we obtained a high-resolution chromatogram depicting the UV absorbance of exosomes. The presence and identity of exosomes were confirmed by probing for markers such as CD56 and CD81 and for cytotoxic proteins such as granzyme B in the exosomes. Their cytotoxic properties against cancer cells were validated through luciferase assays. By incubating exosomes, cancer cells, and an array of inhibitors, we also found that NK exosomes may engage in cytotoxicity via different methods such as ligand-receptor interactions.
Exosomes are a subset of extracellular vesicles (EVs) that are released in vitro and in vivo by several types of cells. They differ from the other EVs, namely microvesicles and apoptotic bodies, in their size, secretion mechanism, and functional properties. Exosomes are implicated in communication between cells and in the movement of molecules such as proteins and nucleic acids. Additionally, they contain proteins, microRNAs, and mRNAs specific to the cell of origin. For example, dendritic cell-derived exosomes contain MHC II-peptide complexes, while exosomes from tumor cells express tumor antigens. Because exosomes are a means of exchanging these proteins and RNA between cells, they can act as vehicles for communication and thus significantly influence numerous cell functions and processes. Current research suggests that exosomes exert their effects on other cells through uptake mechanisms, including endocytic pathways and cell surface membrane fusion.
Of special interest are exosomes derived from natural killer cells due to their cytotoxic properties. Natural killer (NK) cells are cytolytic cells derived from lymphoid stem cells that function in our bodies’ innate immune system, providing defense against tumors and infections. They are also able to mediate antibody-dependent cellular cytotoxicity (ADCC). Previous studies have shown that large numbers of activated NK (aNK) cells can be grown ex vivo from peripheral blood mononuclear cells (PBMC) by coculturing with artificial antigen-presenting cells (aAPC). aNK cells are highly cytotoxic and are able to secrete cytokines and chemokines with anti-tumor potential. Exosomes released by NK cells contain many NK cell markers such as CD56 and FASL. A recent study has shown that exosomes released by these aNK cells contain cytotoxic molecules and exhibit cytotoxicity towards cancer cell lines when isolated, providing the basis for this paper. The study also confirmed that aNK EV-induced cytotoxicity was caspase-mediated and attributed the results to the caspase-activating properties of granzyme B, meaning that the exosomes must have been taken up and internalized. In any case, aNK exosomes are promising targets for future development of anti-cancer therapeutics.
In order to investigate and understand the roles of NK-derived exosomes, we must first be able to isolate and analyze these exosomes. Several methods of isolating exosomes have been described to date. The most commonly used is ultracentrifugation, which is straightforward and simple despite limitations such as low yield and possible disruption and aggregation of exosomes. Exosomes have also been isolated by precipitation with agglutinating agents; the yield is high, but the exosomes are less pure. Filtration is scalable and simple, but its limitations include clogging and non-specific binding of other proteins. Newer filtration methods such as microfluidic filtration might be able to solve these issues; however, they produce diluted exosomes that need to be concentrated, resulting in some loss of the sample. Methods based on chromatography are also becoming more popular. Size-exclusion chromatography is one of the most widely used because it purifies functional exosomes and is reproducible. Immunoaffinity chromatography, on the other hand, requires an antibody of the highest specificity but may be able to differentiate and isolate different tissue-specific exosomes. Each method has advantages and limitations, and an effective and efficient method to isolate biologically active exosomes is still unavailable.
In this report, we propose a very efficient and effective method of isolating large amounts of functional aNK exosomes through the use of the ÄKTA start system. By further analyzing these exosomes, we provide new insight into the characteristics of aNK cell-derived exosomes. Moreover, we show that aNK exosomes may utilize more than just uptake pathways in order to engage in cytotoxicity.
In order to isolate aNK exosomes from human natural killer cells, we followed the procedure for NK cell propagation and activation detailed in a recent study. 30 to 50 mL of human blood was provided by donors from the hospital associated with our research institution. Human peripheral blood mononuclear cells (PBMC) were then grown in cell culture in vitro. aAPCs were γ-irradiated and then added to RPMI medium and fetal bovine serum (FBS) supplemented with recombinant human interleukin-2. The PBMC and the aAPCs were grown in tissue culture for 21 days. The medium was then centrifuged, and the resulting supernatant containing all extracellular vesicles secreted by the NK cells was collected. We connected a 120 mL column prepared with Sephacryl 300 (S-300) beads to the ÄKTA start system and set a flow rate of 0.5 mL per minute and a fraction size of 500 µL. Similarly, a 30 mL column prepared with Sephacryl 200 (S-200) beads was run at a flow rate of 0.5 mL per minute with a fraction size of 1 mL. As a sample runs through the system, an equipped spectrophotometer produces a chromatogram of ultraviolet absorbance to detect proteins. Samples were eluted with ice-cold phosphate-buffered saline (PBS). The fractions were automatically collected by the rotating fraction collector. The 120 mL column was used for samples used in the CD56 and granzyme B enzyme-linked immunosorbent assays (ELISA) and the cytotoxicity assays. Fractions #1-24 were pooled into eight fractions based on the chromatogram and concentrated 600-fold using Amicon Ultra centrifugal filters. The samples from the 30 mL column were used to run the CD81 western blot.
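The fraction bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the grouping into consecutive pools of three is an assumption (the text states that pooling was based on the chromatogram). The flow rate and fraction sizes are the values given above.

```python
# Sketch: pool consecutive fractions and estimate collection time per fraction.
# Helper names and the group size of 3 are illustrative assumptions.

def pool_fractions(first, last, group_size):
    """Return lists of consecutive fraction numbers, group_size per pool."""
    numbers = list(range(first, last + 1))
    return [numbers[i:i + group_size] for i in range(0, len(numbers), group_size)]

def minutes_per_fraction(fraction_ml, flow_ml_per_min):
    """Time needed to collect one fraction at a constant flow rate."""
    return fraction_ml / flow_ml_per_min

pools = pool_fractions(1, 24, 3)
print(len(pools))                      # 8 pools
print(pools[0], pools[-1])             # [1, 2, 3] [22, 23, 24]
print(minutes_per_fraction(0.5, 0.5))  # 1.0 minute (120 mL column, 500 µL fractions)
print(minutes_per_fraction(1.0, 0.5))  # 2.0 minutes (30 mL column, 1 mL fractions)
```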
For our western blots, we loaded 30 µg of protein per lane for separation via SDS-PAGE and used the ThermoFisher Scientific Pierce™ Power Blotter to electrotransfer the proteins from the gel to a polyvinylidene difluoride (PVDF) membrane, which was blocked in 5% non-fat milk in PBS. The membrane was then probed with primary antibodies against the protein of interest and subsequently with anti-rabbit-HRP secondary antibody conjugates. The film was developed using a chemiluminescent substrate. A protein size ladder was used as a standard for the determination of molecular weight.
Granzyme B and CD56 levels were quantified using a granzyme B and a CD56 commercial ELISA kit. We performed each assay according to the manufacturer’s protocol. We incubated the sample in wells, washed, incubated with primary antibody, washed, incubated
with HRP conjugate, washed, incubated with TMB substrate, added stop solution, and read the results using a plate reader. For the granzyme B ELISA, a standard for granzyme B was also used. An equal amount of exosome was used for each sample.
Cytotoxicity assays were performed in triplicate. In the first two, we tested the cytotoxicity of exosomes against CHLA-255 neuroblastoma cells and against Sup-B15 leukemia cells. Both cell lines were transfected with the firefly luciferase gene, and the number of live cells was measured by means of luminescence after adding D-luciferin. We incubated exosomes and the cancer cells in 96-well plates for 24 hours and quantified the flux of live cells using the Promega Glomax multi-detection system. Controls consisted of either CHLA-255 or Sup-B15 cells only. Our final assay utilized inhibitors as well as exosomes and cancer cells. Firefly luciferase-labeled CHLA-255 cells were incubated with 10 nM filipin, 10 ng/mL cytochalasin D, or 50 µM EIPA for 30 minutes prior to the addition of exosomes and throughout the 24-hour interaction with exosomes. This assay was analyzed in the same manner as the previous two, by quantifying bioluminescence using the Promega Glomax multi-detection system. Controls consisted of cancer cells and inhibitor only, to ensure that the inhibitors themselves did not kill the cells.
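One common way to turn luminescence readings like these into a survival figure is to express the mean signal of treated wells as a percentage of the mean signal of the matching control wells. The sketch below assumes that approach; the function name is hypothetical and the readings are made-up placeholder values, not data from this study.

```python
from statistics import mean

def percent_survival(treated, control):
    """Mean luminescence of treated wells as a percentage of the control wells."""
    return 100.0 * mean(treated) / mean(control)

# Hypothetical luminescence readings (arbitrary units), NOT data from the study.
control_wells = [1000.0, 1100.0, 900.0]   # cancer cells only
treated_wells = [400.0, 500.0, 600.0]     # cancer cells + exosome fraction

survival = percent_survival(treated_wells, control_wells)
print(f"{survival:.1f}% survival")   # 50.0% survival
```

For the inhibitor assay, the same calculation would use the cells-plus-inhibitor wells as the control, so that any toxicity of the inhibitor itself is factored out.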
After co-culturing PBMC with K562-mbIL21 aAPC for 19 days, the culture contained more than 95% aNK cells. aNK cells were then cultured in RPMI-1640 with 10% exosome-free FBS, and on day 21, the medium was collected. In this report, we used NK cell supernatant from batch 37, experiment A, hereafter referred to as B37A NK cell supernatant. By running 1 mL of B37A NK cell supernatant through the ÄKTA start system using a 30 mL bed volume Sephacryl S-200 column, we obtained a chromatogram from the equipped spectrophotometer detailing the protein concentration of the supernatant (Figure 2A). The protein concentration directly correlates with the presence of exosomes, as the exosomes are densely packed with these proteins. Two distinct peaks were obtained, which indicates that the resulting chromatogram is of high resolution. In order to further validate the resolution and reproducibility of the ÄKTA start system, we ran two different cancer cell line supernatants, Sup-B15 and CHLA-255, through the same Sephacryl S-200 column (Figure 2B and 2C). Similar chromatograms were obtained, both with two distinct peaks. We also ran 4 mL of B37A NK cell supernatant through a 120 mL bed volume S-300 Sephacryl column (not shown) in order to achieve even higher resolution and more purified fractions for later assays. Thus, this method (Figure 1) is able to isolate exosomes from various sources of cell culture medium. In order to begin characterizing these peaks and exosomes, we pooled certain fractions together and ran ELISAs and western blots to examine protein composition. Fractions were collected as soon as the ÄKTA start system detected UV absorbance; each fraction contained 1 mL, and 30 fractions were collected in total.
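To illustrate what "two distinct peaks" means in terms of the absorbance trace, the following minimal sketch finds local maxima above a threshold in a list of per-fraction absorbance values. The function, the threshold, and the trace are hypothetical placeholders chosen for illustration; they are not the ÄKTA software's method or the study's data.

```python
def find_peaks(absorbance, min_height):
    """Return indices of local maxima at or above min_height."""
    peaks = []
    for i in range(1, len(absorbance) - 1):
        if (absorbance[i] >= min_height
                and absorbance[i] > absorbance[i - 1]
                and absorbance[i] >= absorbance[i + 1]):
            peaks.append(i)
    return peaks

# Hypothetical UV absorbance per fraction (arbitrary units), not the study's data.
trace = [0, 5, 40, 90, 60, 20, 10, 15, 55, 120, 80, 30, 5]
peak_indices = find_peaks(trace, min_height=50)
print([i + 1 for i in peak_indices])   # 1-based fraction numbers: [4, 10]
```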
CD81, a common marker of exosomes, was used to locate the exosomes on the chromatogram (Figure 3A). CD81 was mostly found in peak 1 (fractions #4-7), and little could be found in later fractions. The presence of CD81 confirms that NK exosomes were successfully isolated using the ÄKTA start system and size-exclusion column. Furthermore, because NK exosomes have been shown to harbor cytotoxic proteins such as perforin and granulysin, we probed for the presence of granzyme B using an ELISA (Figure 3B). For this experiment, we focused specifically on peak 1, because exosomes were detected only in peak 1 (Figure 3A), and therefore used samples obtained by running 4 mL of B37A NK cell supernatant through a 120 mL bed volume S-300 Sephacryl column. This procedure yielded four times as many fractions as before, giving more distinct fractions and allowing a finer analysis of peak 1. We found the most granzyme B in fractions #4-6, confirming the presence of cytotoxic exosomes in peak 1 and indicating where most cytotoxicity against the cancer cell lines might be found. Finally, we looked for CD56, also known as neural cell adhesion molecule (NCAM), using an ELISA with the same peak and sample as used for granzyme B. As CD56 is a marker that distinguishes different subtypes of NK cells, we wanted to test whether NK cell-derived exosomes may also be differentiated into subtypes (Figure 3C). CD56 was found in greatest abundance in the shoulder and tail of peak 1 (fractions #10-12 and 19-21), possibly indicating the presence of a different subtype of NK exosome, because exosomes were present throughout the peak 1 fractions. To determine if the isolated exosomes were functional, we performed luciferase assays for cytotoxicity.
We used CHLA-255 neuroblastoma cells and Sup-B15 acute lymphoblastic leukemia cells transfected with the firefly luciferase gene in order to be able to quantify the survival of cancer cells. First, we collected fractions from the same batch used for our granzyme B and CD56 ELISAs to ensure distinct fractions as well as to have a means of comparison with Figures 3B and 3C. We then pooled our fractions of peak 1 into groups of 3 (#1-3, 4-6, etc.) and concentrated them 600-fold using Amicon Ultra centrifugal filters, a protocol we adopted because it yielded highly purified exosomes with little loss of sample. As CD81 was found only in peak 1 of our samples and not in peak 2, intact exosomes were present solely in peak 1. Thus, we decided to concentrate only on peak 1 in order to avoid killing cancer cells with free-form cytotoxic proteins. After 24 hours of incubation with concentrated fractions of aNK cell-derived exosomes, survival of CHLA-255 and Sup-B15 cells decreased significantly in certain fractions (Figure 4A and 4B). Although CHLA-255 is an adherent cell culture and Sup-B15 a suspension culture, NK exosomes did not seem to kill either type selectively, a possibility that had to be accounted for. Most cytotoxicity was observed in the shoulder of peak 1 (fractions #10-15) and thus did not correlate with the fractions richest in granzyme B in the ELISA, which may mean that functional, cytotoxic exosomes are only found in the shoulder of peak 1.
We then wanted to test the uptake of NK exosomes with an array of inhibitors. We decided to use filipin, cytochalasin D, and 5-ethyl-N-isopropyl amiloride (EIPA), all of which have been shown to block the uptake of extracellular vesicles [1, 6]. Filipin is a cholesterol-depleting agent and is thus effective in blocking lipid-raft-mediated endocytosis and caveolin-dependent endocytosis. Cytochalasin D is an inhibitor of actin polymerization and is consequently able to interfere with phagocytosis and other uptake pathways. EIPA is known to block macropinocytosis, another endocytic pathway that extracellular vesicles are able to utilize. Because these inhibitors tend to kill the cells themselves at high doses, we performed many cytotoxicity assays in which we varied inhibitor concentrations. By doing so, we were able to determine the maximum concentration at which the cancer cells do not die solely due to the inhibitor.
For our experiment, we incubated a concentrated combined pool of peak 1 sample with CHLA-255 cells and with either 10 nM filipin, 10 ng/mL cytochalasin D, or 50 µM EIPA (Figure 4C) for 24 hours. The exosomes killed a significant proportion of the cancer cells, and none of the inhibitors was able to prevent the killing. These results indicate that killing may occur through more than just uptake pathways.
References
1. Mulcahy LA, Pink RC, Carter DRF. Routes and mechanisms of extracellular vesicle uptake. Journal of Extracellular Vesicles. 2014;3(1):24641. doi:10.3402/jev.v3.24641.
2. Li P, Kaslan M, Lee SH, Yao J, Gao Z. Progress in Exosome Isolation Techniques. Theranostics. 2017;7(3):789-804. doi:10.7150/thno.18133.
3. Raimondo F, Morosi L, Chinello C, Magni F, Pitto M. Advances in membranous vesicle and exosome proteomics improving biological understanding and biomarker discovery. Proteomics. 2011;11(4):709-720. doi:10.1002/pmic.201000422.
4. Liu Y, Wu H-W, Sheard MA, et al. Growth and Activation of Natural Killer Cells Ex Vivo from Children with Neuroblastoma for Adoptive Cell Therapy. Clinical Cancer Research. 2013;19(8):2132-2143. doi:10.1158/1078-0432.ccr-12-1243.
5. Jong AY, Wu C-H, Li J, et al. Large-scale isolation and cytotoxicity of extracellular vesicles derived from activated human natural killer cells. Journal of Extracellular Vesicles. 2017;6(1):1294368. doi:10.1080/20013078.2017.1294368.
6. Escrevente C, Keller S, Altevogt P, Costa J. Interaction and uptake of exosomes by ovarian cancer cells. BMC Cancer. 2011;11(1). doi:10.1186/1471-2407-11-108.
7. Colombo M, Raposo G, Théry C. Biogenesis, Secretion, and Intercellular Interactions of Exosomes and Other Extracellular Vesicles. Annual Review of Cell and Developmental Biology. 2014;30(1):255-289. doi:10.1146/annurev-cellbio-101512-122326.
8. Fais S. NK cell-released exosomes. OncoImmunology. 2013;2(1). doi:10.4161/onci.22337.

ART-spectrum of Oral Bacteria by Age and Inhibitory Effect of Herbal Extracts
by Do Hun Kim and Dahun Ryu
Art by Saeyeon Ju
Abstract
In this study, we determined the prevalence of bacteria resistant to certain antibiotics, including tetracycline, amoxicillin, vancomycin, and lincomycin, using bacterial cultures obtained from oral samples taken from people of various ages and with various oral health conditions. Nearly all samples (15 out of 16) carried one or more of the antibiotic-resistant bacteria types. Among the four types of antibiotic-resistant bacteria studied, tetracycline-resistant bacteria were the most common. Resistant bacteria were more frequently found in those over the age of 35 compared to those under the age of 21. We also measured the antimicrobial activity of 12 natural extracts on Streptococcus salivarius, Staphylococcus aureus, Streptococcus mutans, Neisseria meningitidis, and Escherichia coli. Of these natural extracts, magnolia vine was found to have the strongest antimicrobial activity, as it inhibited the growth of all five bacteria. Many of the other extracts, specifically sophora, pine leaves, licorice, lavender, oregano, artemisia, Asian lizard’s tail, and turmeric, were also able to inhibit the growth of some bacteria. This study shows that these natural extracts could potentially be effective in suppressing the growth of these antibiotic-resistant oral bacteria.
Introduction
While recently participating in volunteer activities as oral education staff assistants, we wondered how effective dental education training is at reducing harmful oral bacteria. Thus, we investigated the types of germs that cause oral diseases and which species are present in saliva and on the interproximal crevicular surface. We also wanted to know what kind of change in bacteria prevalence would be observed as a result of the experiment. In recent years, many active ingredients for antibacterial toothpastes have been developed. Additionally, it has become common medical practice to prescribe antibiotics to kill pathogens. However, all life forms undergo mutations as they reproduce, and when we overuse antibiotics, we unwittingly help antibiotic-resistant bacteria grow in our bodies, as natural selection favors these forms of bacteria. In order to test antibiotic resistance as a side effect of antibiotic overuse, we cultured oral bacteria in antibiotic-containing culture medium to find out how many antibiotic-resistant bacteria are in people’s mouths. Furthermore, if we could isolate antibiotic-resistant bacteria from oral bacteria, we could also potentially find a natural substance that inhibits their growth without creating the issue of antibiotic resistance. We accordingly tested several natural substances for antibacterial activity.
Materials and Methods
Identification of Oral Bacteria
As shown in Table 1, the saliva of 16 subjects was first diluted to 2%. 40 μl of the diluted saliva was sprayed onto agar medium and cultured. Then, the interproximal crevicular surface of each person’s teeth was scratched with a microbrush. The brush and 2 mL of sterilized distilled water were subsequently put into a conical tube, and the mixture was vigorously shaken for about 10 minutes. 30 μl of this was then sprayed onto agar medium and incubated. Characteristic individual colonies were transferred to fresh agar medium for culturing. These cultures were analyzed by Bionics, a biomedical corporation that identifies bacterial species using 16S rRNA gene sequencing.
Antimicrobial Activity of Natural Extract
2 grams of each of the natural materials were placed into 50-mL tubes, and a 5-fold volume of ethanol was added to each. Then, the antimicrobial components in each of these natural materials were extracted by collecting the supernatant after centrifugation. To heat treat each of the substances, tubes containing the natural products were placed in an 80 °C water bath for at least 10 minutes. Each bacterial strain was cultured in a shaking incubator for more than one day and spread on an agar medium. Afterwards, the natural material sample was slowly absorbed onto a paper disc in four 50 μl aliquots of extract (200 μl total). The clear zone around the disc on the plate medium was measured to assess the antimicrobial activity.
Search for Oral Bacteria
Ten colonies were cultured from seven individuals, and five species of bacteria were identified: S. salivarius, S. aureus, K. planticola, S. parasanguinis, and S. mutans. Some colonies proved to be the same species. Two of the five identified bacteria, S. mutans and S. aureus, cause cavities and angular cheilitis, respectively. K. planticola, a bacterium typically found in plant roots, was unexpectedly identified in individual samples. In adults suffering from poor periodontal conditions, the bacteria that cause periodontitis were not found. One possible explanation is that the culturing conditions were aerobic; since most of the germs causing periodontitis are anaerobic bacteria, their colonies may not have been able to grow.
Antibiotic Resistance Patterns in Children, Adolescents and Adult Oral Bacteria
Samples of oral saliva and of the interproximal crevicular area were inoculated on both plain agar medium and agar medium containing antibiotics (amoxicillin, vancomycin, tetracycline, and lincomycin) in order to observe how many antibiotic-resistant bacteria grew. Samples 1, 3, 4, 5, 6, 9, 11 and 12 were inoculated after 4 weeks of tooth brushing. As shown in Figures 2 to 5, large numbers of antibiotic-resistant bacteria survive even in the presence of excess antibiotics. At least one type of antibiotic-resistant bacteria grew in samples from the majority of participants. However, one 11-year-old adolescent did not harbor any bacteria resistant to the four tested antibiotics.
Figure 1 Natural materials used in the study
Figure 2 Bacterial colonies grown in medium supplemented with Amoxicillin antibiotics
Figure 3 Bacterial colonies grown in media supplemented with Vancomycin antibiotics
Figure 4 Bacterial colonies grown on media supplemented with Tetracycline antibiotics
A comparison of the results before and after 4 weeks of tooth brushing in 8 people showed that antibiotic-resistant bacteria disappeared after brushing in 7 of the 15 cases in which antibiotic-resistant bacteria had grown. In particular, three individuals with amoxicillin-resistant bacteria no longer showed resistance after 4 weeks of brushing. Three out of six individuals with tetracycline-resistant bacteria and one out of six with lincomycin-resistant bacteria likewise showed no resistant growth afterwards. Adults may have more antibiotic-resistant bacteria in their mouths, possibly because they have had more contact with antibiotics as they age. The data support this explanation, as the proportion of people showing antibiotic resistance was higher in adults over 35 years of age for three of the antibiotics, the exception being amoxicillin (Figure 7). Vancomycin showed the greatest difference in resistance between those under 21 years old and those over 35 years old. In order to determine which species the antibiotic-resistant bacteria belong to, they were identified using 16S rRNA gene sequencing. The characteristics of these bacteria are summarized in Table 3.
Figure 6 Percentage of people with antibiotic-resistant bacteria growth among the total population of 16
Figure 7 Percentage of people who were found to have antibiotic-resistant bacteria by age (A: amoxicillin; V: vancomycin; T: tetracycline; L: lincomycin)
Exploring natural substances that inhibit the growth of pathogenic oral bacteria
Magnolia vine showed antimicrobial activity against all five species of bacteria and was most effective against N. meningitidis. Sophora had antimicrobial activity against four of the five bacteria, all except E. coli, and had the greatest effect on S. mutans and S. aureus. Like sophora, pine needle showed antibacterial activity against four bacteria, all except E. coli, while licorice and lavender were effective against only three bacteria. Oregano, artemisia, saururus, and curcuma each had an inhibitory effect against only two types of bacteria. The ginkgo tree extract had only a slight effect on N. meningitidis and no effect on the other four species of bacteria. When these extracts were heat treated and their antimicrobial activity was measured again, most of their antibacterial components proved to be relatively heat-resistant.
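The comparison above can be summarized as a simple tally. The sketch below records, for each extract, the number of the five test species it inhibited as stated in this section, and ranks the extracts by that count. The dictionary layout and variable names are illustrative choices; no measurements beyond the counts in the text are assumed.

```python
# Number of the five test species inhibited by each extract, as stated in the text.
inhibited_counts = {
    "magnolia vine": 5,
    "sophora": 4,
    "pine needle": 4,
    "licorice": 3,
    "lavender": 3,
    "oregano": 2,
    "artemisia": 2,
    "saururus": 2,
    "curcuma": 2,
    "ginkgo": 1,   # only a slight effect, on N. meningitidis
}

# Rank by count (descending), breaking ties alphabetically.
ranked = sorted(inhibited_counts.items(), key=lambda kv: (-kv[1], kv[0]))
for name, count in ranked:
    print(f"{name:14s} {count}/5 species")

broad_spectrum = [name for name, count in ranked if count == 5]
print(broad_spectrum)   # ['magnolia vine']
```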
Conclusions
S. salivarius, S. aureus, K. planticola, S. parasanguinis, and S. mutans were cultured and identified as oral bacteria. Although common oral bacteria, tooth-decay-causing bacteria, and angular-cheilitis-causing bacteria were identified, no periodontitis-causing bacteria were found. After steady tooth brushing for 4 weeks, fewer and/or smaller interproximal crevicular bacterial colonies were observed in multiple subjects. Bacteria were cultured from a total of 16 saliva or interproximal crevicular surface samples. Among tetracycline, amoxicillin, vancomycin, and lincomycin, tetracycline-resistant bacteria were the most common, while amoxicillin-resistant bacteria were the least common. This is likely because tetracycline inhibits protein synthesis across a broad range of bacteria, while amoxicillin inhibits cell wall synthesis in a narrower range. However, in both cases, as antibiotic resistance becomes more common, the need for alternative therapies, such as natural extracts, will increase. Of the participants, only one 11-year-old adolescent showed no antibiotic-resistant bacteria; all others carried bacteria resistant to one or more of the antibiotics. The findings also showed that participants over 35 generally have more antibiotic-resistant oral bacteria than participants under 21 years of age. This may be because total exposure to antibiotics increases with age, making it more likely for resistance to develop. The increased prevalence of antibiotic-resistant bacteria is perhaps a result of the alarming overuse of antibiotics, which makes it increasingly difficult to treat disease-causing bacteria with specific antibiotics as a person’s age increases. Using antibiotic-resistant bacteria found in cultures, the antibacterial activity of several natural extracts was tested in search of new compounds that could potentially inhibit the growth of pathogens. Five species of bacteria, S. salivarius, S. aureus, Streptococcus mutans, N. meningitidis, and Escherichia coli, were used as test organisms. The antimicrobial activity of ethanol extracts of oregano, lavender, artemisia, ginkgo, saururus, curcuma, magnolia vine, sophora, Houttuynia cordata Thunb., pine needles, turmeric, and licorice was then measured by the paper disc method. The magnolia vine had antimicrobial activity against all bacteria but was most effective against N. meningitidis. Ginkgo, on the other hand, had only a slight effect on N. meningitidis. As N. meningitidis and S. maltophilia can cause lung and brain disease, antibiotic resistance poses a major problem. Our results indicate that certain natural extracts, such as magnolia vine, can inhibit the growth of harmful oral bacteria. In addition, among the 8 subjects who brushed for 4 weeks, antibiotic-resistant bacteria disappeared in 7 of 15 cases. Thus, consistent tooth brushing may also help to eliminate antibiotic-resistant bacteria.
References
Jeong MJ, Kim IH, Choi CW. Associations between microbiological presence, pH, and buffer capacity of the saliva and caries in clinical survey. Oral Biol Res. 2004;28(2):73-82.
Carranza FA Jr. Glickman’s Clinical Periodontology, 7th edition. 1990;96:103, 203.
Yella Hewings-Martin. What causes antibiotic resistance? Medical News Today. The Korean Medicine Times 2017.11.17.
Kim DY. The Korean Medicine Times. 2017.11.17.
Go HS, Kim SG, Hong DY. Research of Salix extracts’ effect on the growth of Streptococcus mutans. The 58th National Science Fair. 2012; No. 1419.
Park SY, Yu DJ, Wi SM, Jeong JE, Cho SG. Research about antibiotic resistant bacterial isolation, identification and treatment in Wang suk stream water. The 56th National Science Fair. 2010; No. 1831.
Lee JR, Kim YS, Chang WS, Park OD, Lee YK. Antimicrobial Resistance of Staphylococcus Aureus Isolated from Korean Oral Cavity. Korean Oral Maxillofacial Pathology Society. 2011;35(1):31-36.
Park JG. Research of herb effect in restriction of microorganismal growth. The 57th National Science Fair. 2011; No. 1418.
Chae GH, Eo GS, Jeon YH, Hong JP. Antibacterial Activity of Artemisia Capillaris THUNB on Oral Bacteria. Journal of Oral Medicine and Pain. 2009;34(2):169-177.
Wonkwang University Industry Academic Cooperation. A Composition of disease of oral cavity containing extract of circuma longa and opuntia humifusa. The Korean Intellectual Property Office, patent No. 10-2014-0089728.
Using Machine Learning to Play Video Games
by Daniel Liu
Art by Daniel Kim

In recent years, the popularity of artificial neural networks (ANNs) has been on the rise in the field of machine learning. Many problems that were considered difficult for a computer to solve, such as recognizing images and playing video games, are now tractable using machine learning techniques that train computers to perform tasks. An ANN, a mathematical model that accomplishes many of these feats, resembles the interconnected neurons in the human brain. Like many other machine learning algorithms, ANNs are imperfect, but they can improve as they learn by example. An interesting application of these properties lies in a subfield of artificial intelligence known as reinforcement learning. Reinforcement learning is useful for controlling robots and other systems, and it also has financial applications. In this article, we explore how computers can learn to play video games by examining how ANNs work and how they can be combined with traditional methods of reinforcement learning.

Neural networks learn to approximate continuous mathematical functions by composing simpler functions in a layered structure. Each layer contains a number of neurons, and each neuron is connected to the neurons in the previous layer and the next layer. These connections have weights, represented by numbers, that are modified during learning to help the network produce better results. In essence, each weight represents a miniature function, and together these make up larger functional layers. The layers are then stacked to create one gigantic neural network function. By using many neurons in different types of layers, neural networks can become very flexible in modeling multidimensional data.
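The layered structure described above can be sketched in a few lines of Python. This is only an illustration, not the network used in the article: the layer sizes, the weight values, the choice of tanh as the neuron function, and the helper names `dense` and `forward` are all assumptions made for the example.

```python
import math

def dense(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all inputs, then applies tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Stack layers: the output of one layer is the input to the next."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations

# Hypothetical 2-input network: a hidden layer of 3 neurons, then 1 output neuron.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
output = forward([1.0, 2.0], layers)
print(output)
```

Learning would consist of adjusting the numbers inside `layers`; the code that evaluates the network stays the same.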
While training a neural network, the output of the network is compared to the correct results and the difference is used to deduce the adjustments for each of
the network’s weights. This allows the ANN to minimize the difference between the output of the network and the target results, effectively internalizing the input data. The main goal of ANNs is to learn the general trend of the data, and they have been shown empirically to be capable of doing so for many problems.

In our case of playing video games, the algorithm needs some way to describe the state of the environment, each action that can be taken in that state, the reward for taking a certain action, and how the future reward for taking each action affects our calculations. In many video games, the state can be the observable pixels on the screen, while the actions control movement and other aspects of the player. One significant difference between this system and other machine learning problems is that the correct choice is not known. Also, there are many factors within the game that are outside the player’s control. Thus, an algorithm that solves a reinforcement learning problem must observe the different states and deduce which action leads to the most reward. Fortunately, there is a well-known mathematical formula for this difficult task, and it involves keeping track of the total future reward for attempting some action in a process called Q-learning:

$$Q(s, a) = r + \gamma \max_{a'} Q(s', a')$$

In this equation, $s$ represents the current state and $r$ represents the reward gained from choosing the action $a$. $s'$ represents the next state after taking the current action $a$, $a'$ ranges over the actions available in that next state, and $\gamma$ represents the falloff for future rewards. This equation is quite intuitive, as it shows how the maximum reward attainable in the future, scaled by $\gamma$, should be added to the current observable reward for each action at a certain state. The reason it is expressed in terms of a recursive quality, or “Q,” function is so that the equation can be applied naturally to some sort of storage mechanism for saving the calculated $Q(s, a)$, or the Q-values.
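A table is the simplest such storage mechanism. The sketch below applies the Q-learning equation to a dictionary of Q-values. The discount factor, the learning rate `ALPHA`, the two-action setup, and the three-state chain are assumptions for illustration; the learning rate is not part of the equation in the article, and setting it to 1.0 applies that equation directly.

```python
from collections import defaultdict

GAMMA = 0.9   # assumed discount factor (the "falloff" for future rewards)
ALPHA = 1.0   # assumed learning rate; 1.0 applies the Q-learning equation directly

# Q-table: maps a state to a list of Q-values, one per action (all start at 0).
N_ACTIONS = 2
q_table = defaultdict(lambda: [0.0] * N_ACTIONS)

def q_update(state, action, reward, next_state, terminal=False):
    """Move Q(state, action) toward reward + GAMMA * max over a' of Q(next_state, a')."""
    future = 0.0 if terminal else max(q_table[next_state])
    target = reward + GAMMA * future
    q_table[state][action] += ALPHA * (target - q_table[state][action])
    return q_table[state][action]

# Hypothetical three-state chain: A -> B -> goal, with a reward of +1 at the end.
q_update("B", 0, 1.0, "goal", terminal=True)   # Q(B, 0) becomes 1.0
q_update("A", 0, 0.0, "B")                     # Q(A, 0) becomes 0.9 * Q(B, 0)
print(q_table["A"][0], q_table["B"][0])
```

The reward earned at the end of the chain flows backward one state per update, discounted by `GAMMA` each time, which is how the table comes to reflect future rewards.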
Another interesting idea is that the importance of rewards far in the future is reduced by the discount factor, so the algorithm focuses on more immediate gains. By using the Q-learning equation to iteratively improve the estimates of the quality of each action, the algorithm can learn which actions lead to the maximum reward. Although traditional Q-learning provides an intuitive formula for solving reinforcement learning problems, or in our case, playing simple video games, the algorithm is not practical for even slightly more complicated games. The main problem is the huge amount of memory needed for calculating
and saving the Q-values of all of the observable states and actions. In many games, the number of different states can be extremely large, which makes it nearly impossible to measure the quality of each state and action. The key insight to get around this limitation is that the Q function can be modeled using a neural network. Recall that neural networks can be trained to approximate continuous functions. To combine this with Q-learning, each state can be fed into a neural network that is trained to output the Q-value of each action. Researchers at DeepMind, a company since acquired by Google, used this technique to allow the same neural network architecture to play classic Atari 2600 games, like Pong and Space Invaders [5, 6]. Their network was able to perform considerably better than previous models and even achieved superhuman performance in a few games. The most intriguing part is that their network’s only input was the screen’s pixels. This allowed the same network to be trained for many different types of games. Of course, they also used more complicated ANN layers and incorporated other strategies such as reviewing past experiences. Later developments even allowed computers to play Doom, a 3D video game, using similar methods. The success of the Q-learning and neural network approach taken by DeepMind researchers proved the capabilities of neural networks and led to even more research that built upon those ideas.

Some of these newer ideas can be evaluated through a simple game. The proposed game is very straightforward, as it only involves moving a few blocks on a grid. A few squares are spawned on the grid, with red squares being dangerous (with a reward of -1) and green squares being beneficial (with a reward of +1). The player can move a blue square in four directions: up, down, left, and right. Like many reinforcement learning problems, the goal is to maximize the reward.
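A game of this kind can be sketched as a tiny environment. The class name `GridGame`, the grid size, the square positions, and the choice to remove a square once it is collected are all hypothetical; this is not the author's implementation and only illustrates the state, action, and reward structure described above.

```python
class GridGame:
    """Blue player moves on a grid; green squares give +1, red squares give -1."""
    MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

    def __init__(self, size, player, green, red):
        self.size = size
        self.player = player
        self.green = set(green)
        self.red = set(red)

    def step(self, action):
        """Apply one move and return the reward collected on the new square."""
        dx, dy = self.MOVES[action]
        # Clamp the move so the player cannot leave the grid.
        x = min(max(self.player[0] + dx, 0), self.size - 1)
        y = min(max(self.player[1] + dy, 0), self.size - 1)
        self.player = (x, y)
        if self.player in self.green:
            self.green.remove(self.player)
            return 1
        if self.player in self.red:
            self.red.remove(self.player)
            return -1
        return 0

game = GridGame(size=5, player=(0, 0), green=[(1, 0)], red=[(1, 1)])
rewards = [game.step(a) for a in ["right", "down", "left", "up"]]
print(rewards, sum(rewards))
```

A learning program would not see the coordinates directly as this sketch does; in the article it works from the rendered pixels and has to discover the movement rules and the meaning of the colors on its own.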
The program directly looks at the rendered pixels of the squares, and deciphers how to maximize the reward. It has to learn how the movement system works and recognize that red is bad and green is good. Figure 1 shows how the program
performs throughout the training. Note that each episode (ep), or playthrough of the game, lasts for 50 steps in total, and we do not interrupt it. After slightly more than two hours of training, the program is able to consistently avoid the red squares, and it actively seeks out green squares. This is because more advanced techniques, such as automatically prioritizing more important events while reviewing past experiences, were implemented along with other improvements to the network architecture that were based on research from the DeepMind paper on Q-learning and neural networks [9, 10]. Recent advances in reinforcement learning have produced fascinating results in the world of video games, especially due to the combination of classic techniques with those found in more recent research on larger neural networks. Currently, these developments only allow neural networks to play different types of video games, but they ultimately show the growth of research focused on creating smarter artificial intelligence to solve realistic problems similar to video games.
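One common way to prioritize important events while reviewing past experiences is to replay each stored experience with a probability that grows with how wrong the network's prediction for it was. The sketch below uses that proportional scheme as a stand-in; the article does not specify its exact method, and the exponent `alpha`, the offset `eps`, the function names, and the example values are assumptions.

```python
import random

def sampling_probabilities(td_errors, alpha=0.6, eps=1e-3):
    """Turn each experience's error magnitude into a probability of being replayed."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]

def sample_batch(experiences, td_errors, batch_size, rng):
    """Draw experiences with replacement, favouring those with larger errors."""
    probs = sampling_probabilities(td_errors)
    return rng.choices(experiences, weights=probs, k=batch_size)

# Hypothetical stored experiences and how "surprising" each one was.
experiences = ["hit red square", "empty move", "reached green square"]
td_errors = [1.0, 0.01, 0.8]
probs = sampling_probabilities(td_errors)
batch = sample_batch(experiences, td_errors, batch_size=4, rng=random.Random(0))
print(probs, batch)
```

The small offset keeps even unsurprising experiences from being ignored entirely, so the program still occasionally reviews ordinary moves.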
References
1. Nielsen MA. Neural Networks and Deep Learning. Determination Press; 2015.
2. Mazur M. A Step by Step Backpropagation Example. https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example. Published November 21, 2017. Accessed June 27, 2018.
3. Juliani A. Simple Reinforcement Learning with Tensorflow Part 0: Q-Learning with Tables and Neural Networks. https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0. Published August 25, 2016. Accessed June 27, 2018.
4. Juliani A. Simple Reinforcement Learning with Tensorflow Part 4: Deep Q-Networks and Beyond. https://medium.com/@awjuliani/simple-reinforcement-learning-with-tensorflow-part-4-deep-q-networks-and-beyond-8438a3e2b8df. Published September 2, 2016. Accessed June 27, 2018.
5. Mnih V, Kavukcuoglu K, Silver D, Graves A, Antonoglou I, Wierstra D, Riedmiller M. Playing Atari with Deep Reinforcement Learning. https://arxiv.org/abs/1312.5602. Published December 19, 2013. Accessed June 27, 2018.
6. Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G, Petersen S, Beattie C, Sadik A, Antonoglou I, King H, Kumaran D, Wierstra D, Legg S, Hassabis D. Human-level control through deep reinforcement learning. Nature. 2015;518(7540):529-533. doi:10.1038/nature14236.
7. Bhatti S, Desmaison A, Miksik O, Nardelli N, Siddharth N, Torr PHS. Playing Doom with SLAM-Augmented Deep Reinforcement Learning. https://arxiv.org/abs/1612.00380. Published December 1, 2016. Accessed June 27, 2018.
8. Liu D. DDDQN. https://github.com/Daniel-Liu-c0deb0t/General-Algorithms/tree/master/machine_learning/reinforcement_learning/dddqn. Accessed June 27, 2018.
9. van Hasselt H, Guez A, Silver D. Deep Reinforcement Learning with Double Q-learning. https://arxiv.org/abs/1509.06461. Published September 22, 2015. Accessed June 28, 2018.
10. Wang Z, Schaul T, Hessel M, van Hasselt H, Lanctot M, de Freitas N. Dueling Network Architectures for Deep Reinforcement Learning. https://arxiv.org/abs/1511.06581. Published November 20, 2015. Accessed June 28, 2018.
Evaluating the Effects of Binding Pocket Mutants in the PKA Regulatory Domain by Vainavi Viswanath, Canyon Crest Academy Benjamin Konecny, Poway High School art by Lesley Moon
Protein Kinase A (PKA) is a key signal-transducing protein in the cyclic AMP second messenger system. Previously, it was discovered that mutations of alanine 211 to threonine (A211T) and to aspartic acid (A211D) cause increased and decreased regulatory activity, respectively. A211T leads to acrodysostosis, a disease that causes growth delays and skeletal malformations, while A211D leads to Carney complex. The structural and mechanistic rationale for the opposing behavior of these mutations remains unknown. Here, using computational all-atom molecular dynamics, we studied the effects of the mutants A211T and A211D on the conformation of the binding pocket and on the protein’s function in regulating activity. We present comparisons of internal distances, RMSF, and dihedral measurements of the two mutations, offering insight into how mutations of A211 perturb PKA regulatory activity.
Protein Kinase A (PKA) is a ubiquitous signaling enzyme involved in a variety of pathways [1,2]. The cAMP-dependent pathways are vital in the human body and play a role in regulating many functions, such as increases in heart rate, cortisol secretion, and the breakdown of glycogen and fat, among many others. cAMP is also essential for the maintenance of memory in the brain, relaxation in the heart, and water absorption in the kidney. Protein Kinase A is composed of two regulatory subunits and two catalytic subunits, which phosphorylate other proteins in kinase signaling cascades. When cyclic AMP binds to the binding site in the regulatory (R) subunit, the catalytic (C) subunit is released, as shown in Figure 1.
Figure 1: Cyclic AMP binding to the R subunit to activate the C subunit.
In the inactive form, this protein exists as a heterotetramer containing two regulatory (R) subunits and two catalytic (C) subunits. PKA is regulated by the second messenger cyclic adenosine monophosphate (cAMP), which cooperatively
activates the enzyme by binding to its regulatory subunits and thereby releasing PKA’s catalytic subunits. Recently, two mutations of the same residue of the regulatory subunit, Ala211, located in one of the cAMP binding sites, have been associated with distinct diseases and changes in protein regulation. The A211D mutation leads to decreased PKA regulation and an excess of activity, and is associated with Carney Complex (CNC), a rare hereditary disease that includes the formation of endocrine tumors and cardiac myxomas [4,5,6]. Another mutation of the same residue, to threonine (A211T), has the opposite effect, leading to increased PKA regulation and loss of activity, causing acrodysostosis. The many consequences of this condition include growth delays, skeletal malformations, learning disabilities, and cell resistance to hormones. These mutations have not previously been studied with all-atom molecular dynamics, a method that can provide atomic-level insight into cAMP-binding site structure on the microsecond timescale. Internal distances, RMSF, and dihedral angle measurements were used to analyze the effects of the mutations on the protein’s binding pocket and regulatory functions from atomic-level structures obtained from all-atom molecular dynamics trajectories. Using internal distance measurements, spatial and K-means clustering, Principal Component Analysis, RMSF, and dihedral angle analysis methods, we found that the mutations A211D and A211T destabilize PKA.
Molecular Dynamics Simulations. Simulations were performed starting from the regulatory subunit structure in the crystallographic structure of the holoenzyme, PDB code 2QCS. We simulated three systems: wild type, A211D mutant, and A211T mutant. The systems were prepared in Schrödinger’s Maestro (version 10.4, Schrödinger, LLC, New York, NY), where they were protonated at pH 7.0 and the appropriate residues were mutated. The systems were parameterized with the Amber 14SB force field and solvated in TIP3P water boxes with 150 mM NaCl. Simulations were performed using GPU-accelerated Amber 14. For each system, four stages of minimization (proton only, solvent, solvent and side chains, and full system) were performed, followed by equilibration involving an initial heating to 100 K and heating to 310 K at constant pressure (1 bar). For each of the systems, we ran three independent simulations at 1 bar and 310 K with a time step of 2 fs, resulting in 400 ns of production for each run. Trajectories were visualized using VMD. The wild type has 268 residues while the mutated systems have 291; the wild type is missing 23 residues at the beginning of its sequence.

Internal Distance Measurement. The distances between the alpha carbons of the residues in the PKA binding pocket (residues 199-210) and the mutant residue (residue 211) were measured using cpptraj. A total of 78 measurements were made for every trajectory and there were three trajectories per system, adding to a total of 702 measurements. The distances were measured over the entire trajectory. The mean distances and the variance of the distances were calculated using the numpy package in Python and were graphed.

Internal Distance Clustering and Principal Component Analysis. Spatial clustering was performed on the internal distances in Jupyter Notebooks using msmbuilder. The spatial cluster centroids were visualized in VMD. K-means clustering was performed separately on the internal distances in Jupyter Notebooks using PyEMMA. The principal component plots were obtained by creating two vectors based on the change of the internal distances over the trajectory. The K-means clustering results were plotted on the principal component plots.

Root Mean Square Fluctuation Analysis. Trajectories were aligned according to RMSD of residue backbones. Per residue, excluding hydrogen, RMSF values were calculated using cpptraj.

Dihedral Angle Analysis. To analyze the dihedral angles, we performed pytraj and mdtraj calculations and visualized the data on a Ramachandran plot. We also investigated dihedral angles further by using VMD and inspecting the residues neighboring the mutant residue and the binding pocket for possible conformational changes or fluctuations.
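The measurement counts in the Internal Distance Measurement step can be checked with a short calculation. The count of 78 is consistent with taking every pairwise distance among the 13 alpha carbons of residues 199 through 211; that reading is an assumption inferred from the arithmetic, not something the methods state outright. The `summarize` helper and the example distances are hypothetical, and the sketch uses Python's standard library rather than the cpptraj and numpy tools used in the study.

```python
from itertools import combinations
from statistics import mean, pstdev

# Binding pocket residues 199-210 plus the mutated residue 211.
residues = list(range(199, 212))
pairs = list(combinations(residues, 2))

N_TRAJECTORIES_PER_SYSTEM = 3
N_SYSTEMS = 3  # wild type, A211D, A211T
total = len(pairs) * N_TRAJECTORIES_PER_SYSTEM * N_SYSTEMS
print(len(pairs), total)

def summarize(distance_series):
    """Mean and standard deviation of one internal distance over a trajectory."""
    return mean(distance_series), pstdev(distance_series)

# Hypothetical distances (in angstroms) between one residue pair in four frames.
avg, spread = summarize([5.0, 5.2, 4.8, 5.0])
print(avg, spread)
```

A pair whose distance barely changes across frames gives a small spread, while a flexible pair gives a large one, which is the quantity compared between the wild type and the mutants.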
Figure 2: Standard deviation of internal distances for the wild type, A211T, and A211D systems
Figure 4: Aligned PDB structures of cluster centroids from Spatial Clustering for the three systems, showing greater movement in the mutants than in the wild type. Wild type (cyan), A211T (red), and A211D (gray)
Figure 3: Mean internal distances for the wild type, A211T, and A211D systems over all 3 simulations.
Internal Distance Measurement/ Clustering. The plot of the average standard deviations (Figure 2) of the internal distances over the trajectories for all three simulations showed that the mutant forms varied more than the wild type forms. In the graph, A211D and A211T’s standard deviations were greater than that of the wild type form. Interestingly, the mean internal distances of the wild type and the mutant forms were very similar. This was also supported in the clustering results, which revealed that the mutants fluctuated more than the wild type. The fluctuation was seen when the
cluster centroids were aligned in VMD (Figure 4).

Principal Component Analysis Results. The principal component (PC) plots show that the mutated forms of PKA take on various configurations more often than the wild type. In the PC plot of the wild type, the energy minimum is indicated by the dark blue region. This means that the protein occupies this region of PC space more often throughout the simulation. In the PC plots of A211D and A211T, the energy minima are indicated by the light blue regions. The light blue is equally distributed in the two lobes of the PC plots for A211D and A211T.
Figure 5: Principal Component Plots for wild type, A211D, and A211T, respectively

RMSF and Dihedral Angle Results. In our RMSF graph (Figure 6), we found that the wild type system had the lowest RMSF values throughout the simulations. The A211T system had higher RMSF values than the wild type but lower values than A211D, which had much higher RMSF values than the other systems. In Figure 7, we analyzed the RMSF values of specific residues over the course of the simulation. The first residue is the mutated one, and we found that the wild type is the most stable of the three, followed by A211D and finally A211T. The second residue that we looked at rests on a neighboring loop directly across from the mutated pocket. Our hypothesis was that the mutation itself might not show high amounts of fluctuation, but that it could affect the structure and function of the residues surrounding it. Upon further analysis of the mutated residue and its neighbors, we found that the mutated residue itself did not fluctuate abnormally, but it caused a neighboring residue, G80, to have higher RMSF values and fluctuate more. This could be because G80 is located on a loop near the mutated residue, as shown in Figure 8.
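RMSF measures how far an atom strays, on average, from its own mean position across aligned frames. The sketch below shows the calculation for a single atom; it assumes the frames have already been aligned, uses made-up coordinates, and does not reproduce the cpptraj workflow used in the study. The function name `rmsf` is illustrative.

```python
import math

def rmsf(positions):
    """RMSF of one atom: root mean square distance from its mean position."""
    n = len(positions)
    mean_pos = [sum(p[i] for p in positions) / n for i in range(3)]
    sq_dev = [sum((p[i] - mean_pos[i]) ** 2 for i in range(3)) for p in positions]
    return math.sqrt(sum(sq_dev) / n)

# Hypothetical coordinates (angstroms) of two atoms over four aligned frames.
rigid_atom = [(1.0, 0.0, 0.0)] * 4
flexible_atom = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
print(rmsf(rigid_atom), rmsf(flexible_atom))
```

An atom that never moves has an RMSF of zero, so a residue such as G80 showing higher values than its counterpart in the wild type indicates that it samples a wider range of positions.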
Figure 6: Average RMSF of the three systems (wild type, A211T, and A211D) across all three simulations
Figure 7: Dihedral Angle analysis. The mutated residue, displayed on the graph to the left, shows little movement while a neighboring residue (GLY80) fluctuated much more, especially in the A211T mutation.
Figure 8: The mutated residue (Threonine 122) and the residue (Glycine 80) directly across from it, resting on a neighboring loop.
From measuring internal distances of the alpha carbons of the amino acids in the binding pocket and the mutant residue and performing spatial clustering on the results, it can be concluded that the mutated forms of PKA are less stable than the wild type form (no mutations). The graphs of the standard deviations and means of the internal distances support this. Since the internal distances of the mutated systems have a higher standard deviation than those of the wild type system, the mutated systems are less stable than the wild type. The Spatial Clustering results also support this conclusion. The centroids of the ten most populated clusters from the Spatial Clustering results were aligned using VMD. These structures are particular frames from the simulations that represent the whole cluster. The aligned structures of the wild type are very similar, and there is not much variation seen in the image. However, in the aligned structures of the mutants, there is more variation among the different cluster centroids; they do not align as closely as those of the wild type. This shows that the mutants are much less stable, with the most fluctuation in the cyclic AMP binding domain regions on the ends of the B/C helix.

Next, the principal components of the internal distances show that the mutations destabilize the protein. The dark blue region in the PC plot for the wild type indicates an energy minimum. This means that the protein assumes that conformation more often than any other conformation. The light blue regions in the A211D and A211T PC plots are the energy minima, and they are located in two general areas of the PC plot. This means that the protein assumes these conformations more often than other conformations. However, because the light blue region is spread across several locations on the PC plot, it can be inferred that PKA in its mutated forms takes on more configurations than in its wild type form. Thus, the mutated forms are less stable.
These conclusions are supported by the K-means clustering results that are graphed on the PC plot. In the PC plot for the wild type, all the pink dots (the top ten most-populated cluster centroids) are in the same space as the energy minimum, supporting that the protein assumes a particular configuration more often than other configurations. In the PC plot for the mutants, the pink dots are spread further apart, supporting that the protein takes on various configurations in its mutated forms.
After calculating the RMSF values for the three different systems, we can see that the mutated systems are much less stable than the wild type. In the graph of the RMSF for every residue in the system, the mutated systems consistently display a higher RMSF value than the wild type. This supports the previous observation that the mutated forms of PKA fluctuate more and are thus less stable. Using dihedral angle calculations, it can be seen that the residues around the mutated pocket behave differently. In the wild type, Glycine 80 is fairly stable, while in the two mutants the residue is much more flexible. This flexibility in Glycine 80 is likely due to its proximity to the mutated residue. These changes in RMSF and dihedral angle could contribute to conformational changes in the protein and a change in function. Using these various analysis methods, we found that the mutations A211D and A211T destabilize PKA and affect its activity. A211D results in decreased regulatory activity and A211T results in increased regulatory activity. These results could be beneficial in aiding drug design. Further investigation will be conducted to understand how the mutations destabilize PKA differently, resulting in two completely opposite effects.
1. Scherer MK, Trendelkamp-Schroer B, Paul F, et al. PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J Chem Theory Comput. 2015;11(11):5525-5542.
2. Kim C, Cheng CY, Saldanha SA, Taylor SS. PKA-I holoenzyme structure reveals a mechanism for cAMP-dependent activation. Cell. 2007;130(6):1032-1043.
3. Horvath A, Bertherat J, Groussin L, et al. Mutations and polymorphisms in the gene encoding regulatory subunit type 1-alpha of protein kinase A (PRKAR1A): an update. Hum Mutat. 2010;31(4):369-379.
4. Greene EL, Horvath AD, Nesterova M, Giatzakis C, Bossis I, Stratakis CA. In vitro functional studies of naturally occurring pathogenic PRKAR1A mutations that are not subject to nonsense mRNA decay. Hum Mutat. 2008;29(5):633-639.
5. Roe DR, Cheatham TE 3rd. PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data. J Chem Theory Comput. 2013;9(7):3084-3095.
6. PKA | Maillard Lab - Single Molecule Biophysics @ Georgetown. http://maillardlab.org/mechanisms-of-signaltransduction-in-pka/. Accessed June 26, 2018.
7. Linglart A, Menguy C, Couvineau A, et al. Recurrent PRKAR1A mutation in acrodysostosis with hormone resistance. N Engl J Med. 2011;364(23):2218-2226.
8. Case DA, Ben-Shalom IY, Brozell SR, et al. AMBER 2018. University of California, San Francisco.
9. Humphrey W, Dalke A, Schulten K. VMD: Visual molecular dynamics. J Mol Graph. 1996;14(1):33-38.
10. Harrigan MP, Sultan MM, Hernández CX, et al. MSMBuilder: Statistical Models for Biomolecular Dynamics. Biophys J. 2017;112(1):10-15.
11. McGibbon RT, Beauchamp KA, Harrigan MP, et al. MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. Biophys J. 2015;109(8):1528-1532.
Investigating Stylometric Algorithms for Authorship Attribution
By: Pulkit Rampa // Art By: Lesley Moon
Abstract
This experiment was performed to investigate the effectiveness of popular stylometric algorithms for authorship attribution. Identifying authorship has many applications. It can be used to detect plagiarism, to identify the true authors of historical documents, and to identify criminals who leave a forensic footprint. Three different stylometric algorithms were chosen as the levels of the independent variable: MW Function Words, 4-character n-grams, and 2-Word n-grams. There was no control for this experiment because there is no single most popular stylometric algorithm, and all three of these algorithms (along with variations of each) are widely used. The experiment was performed through JGAAP (Java Graphical Authorship Attribution Program), and all attribution was performed on the Reuter_50_50 database, which was compiled with sample and test texts for stylometry and machine learning. The algorithms were tested for attribution accuracy and received a percentage score based on how many times the author was correctly identified. The data was then tabulated in Microsoft Excel, and the results showed that the 4-character n-grams algorithm had the highest attribution accuracy (44.04%). A t-test revealed a statistically significant difference between the 4-character n-grams algorithm and the MW Function Words algorithm, and a statistically significant difference between the 4-character n-grams algorithm and the 2-Word n-grams algorithm. The results supported the research hypothesis, and further experimentation could investigate how the number of candidate authors impacts the attribution accuracy.
All authors develop a unique writing style through several years of working with the English language. Writers subconsciously prefer certain words or phrases in their work. For example, many writers use “can’t” as a contraction of the two words “can” and “not.” Other writers prefer to write out “cannot,” while more traditional writers will write out “can not.” The field of stylometry involves taking advantage of these subconscious decisions for authorship identification. Stylometric algorithms use machine learning techniques to analyze a set of texts with known
authors (referred to as sample texts), find patterns in the modes of writing, compare the patterns to an anonymous text (referred to as a test text), and output a predicted author with a degree of confidence. There are several stylometric algorithms that identify different types of patterns within sample texts. Three of the most popular methods involve analyzing the frequency of a series of 4 continuous characters (referred to as a 4-character n-gram sequence), a set of two words commonly found together (referred to as a word bigram), and repeated usage of function words (as suggested by Mosteller and Wallace; this method is referred to as “MW Function Words”). Authorship identification for anonymous texts has been an issue for centuries. “Questions of authorship are of interest to scholars, and in a much more practical sense to politicians, journalists and lawyers.” Accurate authorship attribution has many applications in forensics. Stylometric algorithms can be used to verify the authorship of someone who takes credit for another person’s work. They can be used to identify plagiarized paragraphs in a writer’s work. For criminals who use the internet as a medium to communicate, texts posted “anonymously” could be traced to the original author and used as evidence against them in court. Finally, stylometric algorithms could be used to identify the original author of widely reprinted articles. “JGAAP (Java Graphical Authorship Attribution Program) is a Java-based, modular program for textual analysis, text categorization, and authorship attribution” developed at Duquesne University. The latest version of this program (JGAAP 6.0.0) supports 41 different stylometric algorithms and 19 different analysis methods to express the results of the attribution test. The program allows users to input sample texts, select canonicizers (to preprocess documents, e.g.,
normalize white space or strip punctuation), and select stylometric algorithms for predicting authorship. JGAAP is open source on GitHub and makes stylometric algorithms more accessible to non-professionals. The Reuter_50_50 is a data set compiled at UC Irvine which is commonly used for authorship identification and pattern recognition. The data set contains texts from the 50 most published Reuters journalists. All the authors in the database were published under the subcategory of corporate news in order to minimize the topic factor. Each of the 50 authors has a total of 50 sample texts to train the algorithms and 50 test texts (non-overlapping with the sample texts) to test the ability of the algorithms to attribute authorship. The independent variable in this experiment is the type of stylometric algorithm used. The first algorithm tested searches for patterns in the 50 most common instances of 4-gram character sequences. For example, for the word “misspelling,”
the algorithm would identify “miss,” “issp,” “sspe,” “spel,” “pell,” “elli,” “llin,” and “ling” and, across the whole text, choose the 50 most frequent instances. This method has been shown to be effective through the work of Efstathios Stamatatos, Professor at the University of the Aegean, and his team’s work with character n-grams for stylometry. The second algorithm searches for word bigrams. This would find the 50 most common instances of two words next to each other; for example, “much more,” “much better,” and “much of.” The third algorithm reflects the work of Mosteller and Wallace in their authorship identification of the Federalist Papers. The algorithm finds the 50 most common instances of function words such as “by,” “of,” and “to.” The dependent variable in this experiment is the authorship attribution accuracy of each algorithm. This can be measured by dividing the number of times the algorithm correctly identifies the author of the text by the total number of trials. The WEKA-Linear Regression analysis method outputs the certainty with which it believes that a particular author is the true author of the test text (Figure 1).

Figure 1: Example of JGAAP File Output Using WEKA-Linear Regression Analysis Technique (Raw Data)

If the true author of the test text matches the author that the stylometric algorithm presented with the highest degree of confidence, then the algorithm has correctly identified the author. The purpose of this experiment is to test various popular stylometric algorithms and see how they perform given the same sample and test texts. The levels of the independent variable are the 4-character n-gram sequence, word bigram, and MW Function Word algorithms. There is no control for this experiment since there is no single most popular algorithm. Of the three levels of the independent variable, it was hypothesized that the 4-character n-gram sequence would yield the highest authorship attribution accuracy. This is a reasonable expectation, as authors may subconsciously favor gerunds (words ending in “ing”), and there are several popular 4-character prefixes (such as “semi-” and “anti-”).
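The event extraction behind the first two algorithms can be sketched in a few lines. This is an illustration of character and word n-grams in general, not JGAAP's own code; the helper names `char_ngrams`, `word_ngrams`, and `most_common` and the sample sentence are hypothetical.

```python
from collections import Counter

def char_ngrams(text, n=4):
    """All runs of n consecutive characters in the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def word_ngrams(text, n=2):
    """All runs of n consecutive whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def most_common(events, k=50):
    """The k most frequent events with their counts."""
    return Counter(events).most_common(k)

print(char_ngrams("misspelling"))
sample = "much more than much of it and much more again"
print(most_common(word_ngrams(sample), k=1))
```

Applied to “misspelling,” `char_ngrams` yields the same eight sequences listed in the text. An author's profile would then be the frequencies of the most common events across all of that author's sample texts.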
First, JGAAP 6.0.0 was downloaded from the EVL Labs GitHub repository onto a computer running Windows 7 or later. Next, the official Reuter_50_50 database was downloaded from the UC Irvine Machine Learning Repository (available from: https://archive.ics.uci.edu/ml/datasets/Reuter_50_50), and the archive was extracted into its predefined folders “C50 – Test” and “C50 – Train.” Then, all other programs and applications running in the background of the computer were closed through Task Manager. JGAAP was opened, the 50 sample texts of each of the first ten authors were selected from the training folder, and these authors were placed in the “Known Authors” section of JGAAP. Next, the first ten test texts of each of the first ten authors (for a total of 100 test texts) were placed in the “Test Texts” section of JGAAP. Once all of the sample and test texts were set up, the “Canonicizers” tab was opened and the “Strip Punctuation” and “Normalize Whitespace” canonicizers were selected to more accurately isolate the author’s writing style. Then, the “Event Drivers” tab was opened and the “Character n-gram” algorithm, the “Word n-gram” algorithm, and the “MW Function Words” algorithm were selected. Subsequently, n was set equal to 4 for the Character n-gram algorithm and to 2 for the Word n-gram algorithm. Next, the “WEKA-Linear Regression” analysis method was selected under the “Analysis Method” tab. Finally, the “Process” button under the “Review and Process” tab was selected. Once JGAAP had output the authorship predictions (Figure 1), an Excel spreadsheet was opened. A column was labeled for each of the three algorithms, and each algorithm’s results for the 100 test texts were compared with the actual author. Each algorithm was given a 1 when the author was correctly identified and a 0 when the author was incorrectly identified (Figure 2).
Figure 2: Example of Excel Tabulation for Converting Raw Data
The scores were summed for each algorithm, and the final sum was the percentage correctly identified (since there were exactly 100 test texts). The final values for each trial were tabulated on a separate Excel spreadsheet. The procedure was then repeated for the remaining test texts of the first ten authors (11-20, 21-30, 31-40, and 41-50).
This entire procedure was repeated for all 50 authors (1-10, 11-20, 21-30, 31-40, and 41-50) for a total of 25 authorship attribution percentages for each algorithm.
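The scoring step can be written out as a short Python sketch. It assumes the analysis output has already been parsed into a mapping from candidate author to confidence; the function names, author labels, and confidence values below are hypothetical placeholders, not JGAAP output.

```python
def top_author(confidences):
    """Return the candidate author with the highest confidence score."""
    return max(confidences, key=confidences.get)


def attribution_accuracy(predicted, actual):
    """Percentage of test texts whose predicted author equals the true author."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal length")
    # Same 1/0 tabulation as the spreadsheet: 1 for a correct attribution.
    scores = [1 if p == a else 0 for p, a in zip(predicted, actual)]
    return 100.0 * sum(scores) / len(scores)


# Hypothetical confidence output for one test text (names are placeholders).
example = {"AuthorA": 0.61, "AuthorB": 0.27, "AuthorC": 0.12}
predicted = [top_author(example), "AuthorB", "AuthorC", "AuthorA"]
actual = ["AuthorA", "AuthorB", "AuthorA", "AuthorA"]
print(attribution_accuracy(predicted, actual))  # 75.0
```

With exactly 100 test texts per trial, the sum of the 1/0 scores and the percentage are the same number, which is why the spreadsheet sum could be read directly as a percentage.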
The effect of stylometric algorithm on authorship attribution accuracy was studied and the analysis of the data is shown in Table 1.
Table 1: The Statistical Data Analysis of the Effect of Stylometric Algorithm on Authorship Attribution Accuracy (%)
The hypothesis for the experiment was that if the 4-character n-gram algorithm was implemented, then it would result in the highest authorship attribution accuracy. After the data was collected, the mean was found for each level of the independent variable. The collected data showed that the MW Function Words algorithm resulted in the lowest average attribution accuracy (37.2%), and the 4-character n-gram algorithm resulted in the highest attribution accuracy (44.04%); the 2-word n-gram algorithm resulted in an accuracy of 38.8%. On average, the 4-character n-gram algorithm therefore had the highest performance, which supports the research hypothesis. Upon closer analysis of the data, the 4-character n-gram algorithm also had the lowest standard deviation (3.867). This further supports the research hypothesis, as it shows that the algorithm is not only the most accurate but also the most consistently accurate. Multiple t-tests were performed to determine the significance of the data at a significance level of 0.05 with 48 degrees of freedom. The null hypothesis was that there would be no significant difference between the effectiveness of the three stylometric algorithms. The calculated t-values for the comparisons between MW Function Words and 4-character n-grams, and between 4-character n-grams and 2-word n-grams, exceeded the critical t-value of 2.011 for significance. This suggests that these differences were significant and that the data rejected the null hypothesis. It also implies that the p-values for these comparisons were less than 0.05; that is, differences this large would be expected less than 5% of the time if the algorithms truly performed equally.
Because the calculated t-values were greater than the critical t-value, the differences were statistically significant and are attributed to the change in the independent variable. The calculated t-value for MW Function Words vs. 2-word n-grams (0.989) was less than the critical t-value of 2.011 for significance. This implies that this particular comparison failed to reject the null hypothesis and that there is no significant difference between these two levels of the independent variable. The p-value for this comparison was greater than 0.05, so the observed difference may not have arisen due to a change in the independent variable. In conclusion, the effect of stylometric algorithm on authorship attribution accuracy is statistically significant, with the exception of MW Function Words vs. 2-word n-grams.
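The degrees of freedom reported above (48) are consistent with a pooled two-sample t-test on two groups of 25 accuracy values, since 25 + 25 - 2 = 48. The source does not state which t-test variant was used, so the following Python sketch assumes the pooled (equal-variance) form. The function name and the means and standard deviations in the example are illustrative only; they are not the study's data.

```python
import math

CRITICAL_T = 2.011  # two-tailed critical value quoted in the text for df = 48


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for a pooled-variance two-sample t-test."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


# Illustrative inputs: two groups of 25 trials each.
t_value, df = pooled_t(10.0, 2.0, 25, 8.0, 2.0, 25)
print(df, round(t_value, 3), abs(t_value) > CRITICAL_T)
```

A comparison is called significant at the 0.05 level when the absolute t-value exceeds the critical value, which is the decision rule applied in the analysis above.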
This experiment was performed to identify which popular stylometric algorithm had the highest authorship attribution accuracy on a popular machine-learning database. A research hypothesis was formed stating that the 4-character n-gram algorithm would result in the highest attribution accuracy of the three algorithms tested. The results of the experiment supported the research hypothesis. Other researchers have worked with stylometric algorithms for authorship identification. For example, a researcher at Duquesne University was able to identify the book “The Cuckoo’s Calling” as written by J.K. Rowling even though she wrote the book under a pseudonym. Other researchers from Stanford have investigated how the size of the sample text database impacts the overall attribution accuracy. When analyzing the overall process in stylometry, there are many factors to consider. For example, in this experiment only the 50 most common instances of patterns were used. However, this can be changed so that any number of the most common instances can be used in pattern analysis between sample and test texts. There are several opportunities for further research in this field. First, the experiment could be conducted using a different database of anonymous forum texts or tweets instead of a predetermined news database. This could be useful for criminal identification or dark web forum poster identification. Second, more experimentation could be performed to see how the number of potential authors affects the attribution accuracy. In this experiment, there were ten candidate authors per trial. Realistically, however, an investigator may be able to narrow down the candidate list to fewer people, and a smaller number of candidate authors would likely result in an increase in attribution accuracy. Finally, further experimentation could be performed on the importance of the sample text size. In this experiment, since there was a predetermined database, there was a limit of 50 sample texts for each author.
However, if the program could analyze thousands of an author’s previous texts, then the attribution accuracy would likely increase.
Boukhaled MA, Ganascia J-G. Using Function Words for Authorship Attribution: Bag-Of-Words vs. Sequential Rules. Natural Language Processing and Cognitive Science. September 2015:115-122. doi:10.1515/9781501501289.115.
Ali N. Text stylometry for chat bot identification and intelligence estimation. Electronic Theses and Dissertations. May 2014. doi:10.18297/etd/31.
Eder M. Does size matter? Authorship attribution, small samples, big problem. Digital Scholarship in the Humanities. 2013;30(2):167-182. doi:10.1093/llc/fqt066.
Juola P, Vescovi D. Analyzing Stylometric Approaches to Author Obfuscation. Advances in Digital Forensics VII, IFIP Advances in Information and Communication Technology. 2011:115-125. doi:10.1007/978-3-642-24212-0_9.
Rybicki J, Eder M. Deeper Delta across genres and languages: do we really need the most frequent words? Literary and Linguistic Computing. 2011;26(3):315-321. doi:10.1093/llc/fqr031.
Stamatatos E. Author identification: Using text sampling to handle the class imbalance problem. Information Processing & Management. 2008;44(2):790-799. doi:10.1016/j.ipm.2007.05.012.
Zhioua S. Tor traffic analysis using Hidden Markov Models. Security and Communication Networks. 2012;6(9):1075-1086. doi:10.1002/sec.669.
The Antithetical Effects of Hard Water and Tsingtao Beer on Hair
By Dahun Ryu, Dohun Kim, and Edward Lee
Art by Anna Jeong
ABSTRACT
In recent years, travelers in China have been raising concerns regarding the perceived damage and rough texture of their hair resulting from their stays in the country. The use of hard water is commonly regarded as the most plausible explanation for this phenomenon. In this study, we conducted 10 water quality analyses and observed hair cuticles and scalp cells under a microscope in order to examine this common perspective objectively. Our results indicate that tap water from Qingdao, a coastal city in China, contains a high level of lime due to the lack of efficient processing, which negatively affects hair and scalp cells. Two treatments were proposed: one using the active ingredients of sweet flag and one using non-alcoholic beer yeast. These treatments were then applied to three volunteers for three days. From the data collected, we concluded that the hair and scalp damage caused by Chinese tap water can be improved with the sweet flag and beer treatments.
INTRODUCTION
Lime water, more commonly known as hard water, is a big issue in the world today. With its high mineral content, hard water plays a key role in damaging the strength of our hair. According to the WHO, the high concentration of Mg2+ and Ca2+ ions in Qingdao tap water classifies it as hard water. Our recent survey found that 38 out of 55 foreign residents living in Qingdao felt that the poor water quality was their biggest headache. Among these foreigners, 21 people said that their hair suffered damage when they moved to Qingdao due to the use of hard water. There are several methods to reduce the hardness of water, the most common being the ion exchange method. However, there is a drawback to this method: replenishing reagents such as Na2CO3 is quite expensive. In this study, we focused on two cheaper substances: beer and Acorus calamus, also known as sweet flag. Both have been used for treating hair in many East Asian countries because of their evident beneficial effect on hair [2,3,4]. The malt and hops in beer are rich in proteins that help repair damaged hair, while sucrose and maltose sugars help create shine. Similarly, the extract of sweet flag roots is one of the major constituents in the formulation of hair care products. Therefore, we decided to examine their effects on hair.
MATERIALS AND METHODS
1. Analysis of Korean and Chinese tap water
Fifty milliliters of tap water were randomly sampled in South Korea and China. Then, an EZ water monitoring kit set was used for the basic water analysis to measure seven characteristics: COD (chemical oxygen demand), PO4 3- (phosphate), NO3- (nitrate), NO2- (nitrite), NH4+ (ammonium), DO (dissolved oxygen), and pH (potential of hydrogen). The concentrations of heavy metals and chlorine were also measured using a B50 kit.
2. Making artificial hard water
First, 0.592 g of Ca(OH)2 was dissolved in 4.9 L of distilled water. Then, the solution was neutralized by slowly adding aqueous HCl to pH 7. Lastly, distilled water was added again to fill the solution up to exactly 5.0 L. The process was repeated to make artificial “Korean” tap water. These simulations were made using calculations of mean water hardness from samples of both Korean and Chinese tap water.
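The relationship between the mass of Ca(OH)2 dissolved and the resulting Ca2+ content can be checked with a short calculation. The sketch below assumes complete dissolution and standard molar masses, and it expresses concentration as mg of Ca2+ per litre. The source does not state which convention its ppm values follow (for example mg/L of Ca2+ versus CaCO3 equivalents), so this is a way to reproduce the arithmetic under stated assumptions, not the authors' own calculation. The function names are hypothetical.

```python
CA_MOLAR_MASS = 40.078      # g/mol
CAOH2_MOLAR_MASS = 74.092   # g/mol: Ca + 2 x (O + H) = 40.078 + 2 x 17.007


def calcium_mg_from_caoh2(mass_g):
    """Total mass of Ca2+ (mg) supplied by mass_g grams of Ca(OH)2."""
    moles = mass_g / CAOH2_MOLAR_MASS
    return moles * CA_MOLAR_MASS * 1000.0


def calcium_mg_per_litre(mass_g, final_volume_l):
    """Ca2+ concentration (mg/L) after making the solution up to final_volume_l."""
    return calcium_mg_from_caoh2(mass_g) / final_volume_l


total_mg = calcium_mg_from_caoh2(0.592)
print(round(total_mg, 1))
print(round(calcium_mg_per_litre(0.592, 5.0), 1))
```

Changing the final volume argument shows how strongly the resulting concentration depends on the volume the solution is made up to.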
3. Observation of hair damage and CCD-986 cells after exposure to Chinese and Korean tap water
Three strands of black hair were plucked from three teenage volunteers. Then, each hair strand was submerged in either Chinese tap water, Korean tap water, or distilled water for 24 hours. The cuticles of the hair strands were observed with an OLYMPUS BX41 Research Upright Microscope in phase 200X and in phase 400X for the observation of the middle and tip sections of the hair. The same process was repeated with white hairs.
4. Effects of hard water on CCD-986 cells
CCD-986sk cells (KCLB No. 21947), a human fibroblast cell line that we assume to be related to hair quality, were purchased from the Korean Cell Line Bank. They were grown and maintained for 7 days at 37 °C in a culture medium consisting of 1% fetal bovine serum (FBS) in DMEM. First, the cells were intentionally detached from the culture medium. Cultured cells at 2 × 10^6 cells/mL were randomly selected and distributed equally into five solutions of 1800 μL. Then 100 μL of a solution with the same Ca2+ concentration as Korean tap water and 100 μL of distilled water were added to the first cell solution, so that each 100 μL made up 5% of the total cell solution. This process was repeated with Chinese tap water. The Chinese tap water solution was then replicated twice, with sweet flag extract added to one replicate and alcohol-free beer yeast added to the other. Finally, the control group was made by adding distilled water to the cells. These five cell solutions were incubated again at 37 °C for 1 hour. Five random area samples of microscopic pictures were adjusted for 20 μL scale for each of the cell solutions. Under the same phase, the number of cells within the area of each microscopic picture was counted, recorded, and compared to estimate the differences between the five cell solutions. Moreover, in order to observe the cells’ condition, they were stained with methylene blue. Then, the shapes of the cells were compared through the microscopic pictures.
5. The effects of the beer extract treatment and sweet flag treatment on hair exposed to Chinese hard water
Three teenage male volunteers participated as subjects, and they washed their hair as instructed below. Subject 1 acted as the control group, whose hair was washed only with Korean tap water for six days. Subjects 2 and 3 used the extract treatments.
RESULTS AND DISCUSSIONS
1. Water analysis for tap water
A. Basic water analysis
Basic water testing of tap water samples from Korea and China showed a clear difference between the two. In most cases, the Chinese tap water quality was worse than that of the Korean tap water; however, the chlorine concentration was similar. Chlorine is an ingredient used to disinfect tap water. Through chemical reactions, it affects the physical and electrical properties of hair, making it stiff. However, since the amount of chlorine found in Chinese tap water is relatively small, it is hard to conclude that chlorine is the cause of the hair damage.
B. Analysis of water hardness
The hardness of water was measured indirectly to determine the lime content in tap water. Based on EDTA titration, the concentration of Ca2+ (aq) ions was 60.54 ppm in Korean tap water and 320.00 ppm in Chinese tap water. Chinese tap water thus has about 5.3 times the Ca2+ concentration of Korean tap water, which could be a major cause of the hair damage attributed to Chinese tap water.
2. The change in hair structure caused by tap water
In order to test the effect of hard water on hair elasticity, we put the black hairs in three types of water: Chinese tap water, Korean tap water, and distilled water. The hairs were dried and observed after 24 hours. As shown in Figure 1, distilled water had the least effect on the hair surface. The result for Korean tap water was similar; the hair cuticles were preserved well, but not as well as those in distilled water. However, the surface of the hair soaked in Chinese tap water appeared to be rough. This is because each hair shaft is made up of little scales, like shingles on a roof. Chinese tap water tends to make the scales stand up, which makes hair rough and tangly. Since the cuticles of white hairs are more visible, the experiment used white hairs for further comparison between the control group and hair exposed to Chinese tap water. As shown in Figure 2, the hair cuticles of the control group are more densely packed than those of the Chinese sample, which suggests that the cuticles of the Chinese sample are split. The results above show that Chinese tap water, which has a high concentration of Ca2+, does damage hair.
Figure 1: Hair after exposure to distilled water, Korean tap water, and Chinese tap water
Table 1: Results of Water Analysis
Key: COD: chemical oxygen demand, DO: dissolved oxygen, NH4+: ammonium, NO2-: nitrite, NO3-: nitrate, PO4 3-: phosphate, pH: potential of hydrogen
Figure 2: White hair after exposure to distilled water and Chinese tap water
Figure 5: Cells exposed to media containing C_TW
Figure 7: Comparing the number of normal cells between the control group, K_TW and C_TW
B. The changes in cells in the hard water culture fluid with beer and sweet flag
After the cells were stained with methylene blue, they were observed under the microscope at x200, with the x400 lens used for specific cells. First, the comparison between the control group and the C_TW group showed clear differences in both quantity and size: the control group had more cells, and its cells were larger. After using the treatments, we found that both Tsingtao beer and sweet flag had a positive effect on cells. We conclude that both sweet flag and Tsingtao beer can protect the scalp cells from hard water, with Tsingtao beer showing a more apparent positive effect on cells. Table 3 shows the number of normal cells and abnormal cells in 4 groups, each of which has 4 samples. It also indicates the mean value of the individual samples. According to Figure 9, the mean number of normal cells for Chinese tap water was only 1.3. The numbers for the two treatments, tap water with sweet flag and tap water with Tsingtao beer, were almost double that amount. These results indicate that Tsingtao beer is the best treatment.
Figure 3: Cell exposed to control media
3. Effects of hard water on CCD-986 cells and effects of beer and sweet flag
A. Changes in cells due to exposure to hard water
As shown in Figure 3, many cells were present in the sample exposed to distilled water (pointed out by the yellow arrows). Moreover, the cells are considerably large, as pointed out by a red arrow, and there are a few abnormal cells, as pointed out by the blue arrow. As shown in Figure 4, the sample exposed to Korean tap water contained more abnormal cells; however, the total number of cells was similar to that of the control group. As shown in Figure 5, the Chinese tap water sample had the fewest normal cells of the three samples, and abnormal cells were not observed. Figure 6 shows the normal and abnormal cells (CCD-986SK) that exist in distilled water, Korean tap water, and Chinese tap water. Both the control group and the K_TW group had roughly 6 cells in the cell culture media, but the standard deviation of K_TW was slightly higher. However, in Chinese tap water, only an average of 1.3 normal cells were measured, down by about 78.3% from the number of cells exposed to distilled water and Korean tap water. The results show that Chinese tap water can have a negative effect on hair, roots, and scalp cells.
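The percentage drop quoted above, and the hardness ratio reported in the water analysis, follow from simple arithmetic. The sketch below assumes a baseline of exactly 6.0 normal cells, since the text gives the control and K_TW counts only as "roughly 6"; the function names are hypothetical.

```python
def percent_decrease(baseline, observed):
    """Percentage by which observed falls below baseline."""
    return 100.0 * (baseline - observed) / baseline


def fold_ratio(higher, lower):
    """How many times larger one measurement is than another."""
    return higher / lower


# Normal cell counts: assumed baseline of 6.0 versus 1.3 in Chinese tap water.
print(round(percent_decrease(6.0, 1.3), 1))   # 78.3
# Ca2+ concentrations (ppm): Chinese versus Korean tap water.
print(round(fold_ratio(320.00, 60.54), 1))    # 5.3
```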
Table 2: Number of normal cells and abnormal cells in the control group, K_TW and C_TW
Figure 8: Staining cells
Table 3: The number of normal cells and abnormal cells in the control, C_TW, C_TW with sweet flag, and C_TW with Tsingtao beer groups.
Figure 6: Comparing the amount of cells between the control group, K_TW and C_TW
Figure 4: Cells exposed to media containing K_TW
Figure 9: Comparison of normal cells between four groups
Figure 10: Proportion of normal cells to abnormal cells in three groups
Figure 10 shows the percentage of normal and abnormal cells per group. Although there were more cells in the sweet flag group than in the Tsingtao beer group, the percentage of normal cells in the Tsingtao beer group was larger. Therefore, Tsingtao beer is more effective than sweet flag.
4. Effects of hard water on hair exposed to beer and sweet flag
A. Changes in the condition of hair exposed to tap water in Korea for 6 days
As shown in Figure 11, there were no significant changes observed in the hair follicles; however, there was a subtle change in the midsection of the hair. The hairs turned slightly rough during the last 2 days.
B. Changes in hair condition after sweet flag exposure, following 3 days of exposure to hard water
Figure 12 depicts hair treated with hard water and sweet flag. During the first three days, the hair showed a significant change; the hair cuticles were rougher than those of the control group. Moreover, the tip of the hair suffered severe damage, specifically on the second day, when the tip of the hair was observed to be cracked. However, after the sweet flag treatment, a considerable improvement was observed in the hair cuticles. The crack was not observed from the fourth day onwards. Another drastic improvement was observed in the midsection of the hair; the hair with the treatment was much softer than hair without the treatment. The above observation suggests that sweet flag is an effective cure for hair cuticles.
C. Changes in hair condition after Tsingtao beer treatment, following 3 days of exposure to hard water
As shown in Figure 13, there was a significant improvement in the sturdiness of the hair. On the sixth day, the improvement was much more apparent; the hair strand was smoother than before, having no cracks at all, and the follicle also showed improvement. At first, the follicle was spread, but then it returned to its initial state after the treatments.
In this study, the hardness of Chinese tap water was observed to be 320 ppm, while Korean tap water was measured to be only 60.54 ppm. Chinese tap water proves to be detrimental for washing hair due to the higher levels of calcium and magnesium ions that make up this “hard” water. The application of sweet flag had a positive effect on the viability of the hair cells. In comparison to the control group, the treatment group had a profound effect on the proliferation of cells. Moreover, the application of Tsingtao beer had an even more positive effect on cells than sweet flag did. Cells were best preserved when treated with Tsingtao beer. Ultimately, it can be determined that Tsingtao beer has a significant positive effect on hair.
Figure 11: 6 hair strands under the microscope
Figure 12: Hair strands that were exposed to hard water for 6 days, then to sweet flag for 3 days.
Figure 13: Hair strands exposed to hard water for 6 days, then to Tsingtao beer for 3 days.
REFERENCES
1) “Water Softening by Combination of Ultrasound and Ion Exchange.” Egyptian Journal of Medical Human Genetics, Elsevier, 15 Oct. 2008
2) Petras R. Venskutonis & Audrone Dagilyte (2003) Composition of Essential Oil of Sweet Flag (Acorus calamus L.) Leaves at Different Growing Phases, Journal of Essential Oil Research, 15:5, 313-318
4) Supannee Sripanyakorn, Ravin Jugdaohsingh, Hazel Elliott, The silicon content of beer and its bioavailability in healthy volunteers, British Journal of Nutrition (2004), 91, 403–409
5) Sang-Oh Park, Byung-Sung Park, Ga-Yeong Noh, Effect of natural plant extract (Abelmo) on action mechanism and hair growth activities in C57BL/6 mice, Journal of the Korean Oil Chemists’ Society Vol.31 No.4
6) Suwalski, Marianne. “Effects of Chlorine on Human Hair.” Effects of Chlorine on Human Hair, DesalinatedWater.info Wanstrow, Somerset BA4 4SU, England
CAR-T Cell Immunotherapy Against Cancers
By Clyde Xu
Art by Saeyeon Ju
Abstract
CAR-T cell therapy is a new type of cell-based cancer immunotherapy. A CAR is a recombinant receptor that can recognize a tumor cell’s surface antigen and provide T cell activation signals. Building on the basic structure of the CAR, a wide range of CARs have been developed. Currently, approximately 200 CAR-T studies are registered on clinicaltrials.gov. Above all, three leading CAR-T cell products have received FDA “Breakthrough Therapy” designation: tisagenlecleucel (Novartis/ELIANA) for relapsed/refractory B cell acute lymphoblastic leukemia (B-ALL), JCAR015 (Juno/ROCKET) for B-ALL, and axicabtagene ciloleucel (Kite/Zuma01) for relapsed/refractory diffuse large B cell lymphoma (DLBCL). In August of 2017, the FDA approved the first CAR-T therapy, tisagenlecleucel (Kymriah, CTL019; Novartis). So far, the FDA has approved tisagenlecleucel and axicabtagene ciloleucel, but not JCAR015, in the United States. The approval represents a historic development in modern cancer treatment. Tisagenlecleucel and axicabtagene ciloleucel are genetically modified CAR-T forms of the patient’s own T cells, reprogrammed to target CD19+ B cancer cells. Both products have striking clinical effects against relapsed and/or refractory B cell malignancies. However, in spite of these promising clinical results, toxicity is still a major concern for all CAR-T therapies. The most prevalent toxicities within a few days after infusion of CAR-T cells are cytokine release syndrome (CRS) and the destruction of healthy tissues, both of which are life-threatening. Although CAR-T therapy has been highly successful, several obstacles, such as tumor-target antigen selection, toxicity control, and efficacy against solid tumors, still need to be overcome in the future.
What is CAR-T cell therapy?
In 2010, Emily Whitehead was a 5-year-old diagnosed with acute lymphoblastic leukemia (ALL). Emily had been receiving chemotherapy for sixteen months, but even so, she still experienced a relapse. That’s when her parents learned about a new therapy known as chimeric antigen receptor T-cell therapy (CAR-T therapy), developed by scientists from the University of Pennsylvania. Emily was the first child to receive CAR-T therapy, and now she is a thriving 13-year-old. In 2014, the US FDA granted “Breakthrough Therapy” designation (BTD) to Novartis Pharmaceutical’s CAR-T therapy CTL019 for treatment of ALL. This was the first CAR-T therapy to receive the FDA’s BTD. In August of 2017, the FDA approved the first CAR-T therapy, tisagenlecleucel. The approval represents a historic development in modern cancer treatment. Like all technology, CAR-T technology has experienced a long process of evolution. The first generation of CAR-T is capable of providing the signal needed for T cell activation, but its anti-tumor activity is limited in vivo. Unlike the first generation, the second generation of CAR-T adds a new costimulatory signal and can increase the CAR-T-mediated killing effect. The third generation of CAR-T has proved to be successful in the clinic. This article focuses on the basic concept, principles, clinical application, and challenges of CAR-T cell therapy in cancers.
The major challenge for cancer immunotherapy is to induce an immune response against tumor cells. Effective tumor immunity requires immune recognition of tumor antigen and tumor eradication by immune effector cells. “Tumor cells express the human leukocyte antigens (HLA) of the major histocompatibility complex (MHC) which can restrict the endogenous T-cell receptor (TCR) to recognize tumor cells.” Therefore, T cell immunotherapy is limited by the MHC restriction of TCR-mediated antigen recognition. CARs are recombinant T-receptors or artificial T cell receptors whose extracellular domain is an antibody-derived single chain variable fragment (scFv) that recognizes a tumor cell surface antigen. The scFv is linked to an intracellular signaling domain including the TCR CD3ζ subunit. Since signaling via CD3ζ is not sufficient for optimal T cell function, CARs also include the intracellular signaling domain of a costimulatory receptor, such as CD28 or 4-1BB, which provides additional signals to optimize function. A basic CAR structure consists of four domains: (i) an extracellular antigen-binding domain, a single polypeptide chain with an antibody-derived single chain variable fragment (scFv); (ii) an extracellular spacer domain, interposed between the extracellular binding domain and the transmembrane domain, which seems to be mandatory for functional expression of CD3ζ signalling; (iii) a transmembrane domain that mediates signal transduction within CARs; (iv) an intracellular signalling domain that is responsible for optimizing the functions of engineered T cells.
Chimeric antigen receptors can trigger T-cell activation in a manner similar to that of T-cell receptors. When endogenous T cells are genetically modified with a chimeric antigen receptor, the CAR-T cells target the antigens through an antibody-derived binding domain in the extracellular domain, inducing T cell activation via the intracellular signaling domains. Therefore, CAR-T cells have several advantages over endogenous T cell receptors, including redirection toward a broad variety of targets and the combination of signaling domains for T cell activation. More importantly, unlike the endogenous TCR, CARs do not require antigen processing and presentation by MHC, so CAR-mediated antigen recognition is not MHC-restricted. Building on the basic structure of the CAR, a wide range of CARs have been developed. Since CAR design must reconcile antigen recognition with signal transduction functions, the first-generation CAR-T cells include both an extracellular binding domain and an intracellular T cell activation domain by transducing both a CAR and a CD3ζ T cell activation domain. The extracellular binding domain has an antibody-derived fragment that recognizes tumor antigens, which avoids the MHC restriction and allows the CAR-T cells to express receptors specific for different tumor antigens. The first-generation CAR design contains only one intracellular T cell signaling component: CD3ζ. Although CD3ζ alone is sufficient to induce T cell cytotoxicity, the activation is not enough for a long-term in vivo effect. In short, the first-generation CAR-T cells have shown some ability to induce an anti-tumor effect, but their anti-tumor activity and persistence are sub-optimal. The persistence of CAR-T cells is critical to durable responses. In the absence of a costimulatory domain in the CAR, the signaling through first-generation CARs is insufficient to induce CAR-T cell proliferation. In order to overcome this disadvantage of the first-generation CAR and sustain the long-term immune response, second-generation CAR-T cells were developed. Their primary TCR signal is modified with the addition of an intracellular co-stimulatory signalling domain such as CD28. CD3ζ activation combined with CD28 costimulation supports T cell activation, tumor killing, cytokine production, and long-term persistence. The second-generation CARs greatly improved the in vivo function of CAR-T cells, and they represent important progress in the development of clinically efficacious T cell therapy. Since the combination of primary and costimulatory signals sustains distinct T cell functions, investigators have also designed next-generation CARs to further improve the target selection, toxicities, and immune escape of CAR-T cell therapy. For instance, the third-generation CAR design includes CD3ζ paired with two costimulatory domains, such as CD28 together with another co-stimulatory molecule. The design strategy of the fourth generation of CARs aims to improve specificity in redirecting T cells towards target cells.
Manufacturing of CAR-T cells
Current approaches to CAR-T cell generation require harvesting T cells from a patient and then activating, genetically modifying, and expanding them for reinfusion into the patient (Figure 1). First, a patient’s peripheral blood mononuclear cells (PBMCs) are collected through leukapheresis. Once the PBMCs have been isolated, the cells are transported to a manufacturing facility for ex vivo activation. The activation process is independent of antigen presentation. Usually, the cultured T cells are activated with beads coated with CD3/CD28 antibody fragments, along with IL-2. Expansion (amplification) is another important step: it is required to generate enough T cells for infusion into the patient. Several platforms can be used to expand CAR-T cells; currently, CD3/CD28 antibody-functionalized beads are used to expand T cell populations. Expansion can be performed before or after gene transduction.
Figure 1: Ex vivo CAR-T generation (cited and modified from Olweus J. Manufacture of CAR-T cells in the body. Nat Biotechnol. 2017.)
CARs can be transferred into primary human T cells ex vivo using viral or nonviral approaches. Since viral vectors insert transgenes randomly into the genome, there is a risk of insertional oncogenesis. Recently, nonviral transfection techniques have shown promise in ameliorating some of the drawbacks associated with viral vectors. Nonviral approaches use transposons, such as the Sleeping Beauty and piggyBac transposon systems, and the more recent CRISPR/Cas9 technology. These approaches have been used to produce CAR-T cells.
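The manufacturing steps in this section can be laid out as an ordered workflow. The following Python sketch is a minimal illustration, not a manufacturing protocol: the names `Step` and `workflow` are hypothetical, and the only logic it encodes is the statement that expansion can be performed before or after gene transduction.

```python
from enum import Enum
from typing import List


class Step(Enum):
    LEUKAPHERESIS = "collect PBMCs by leukapheresis"
    ACTIVATION = "activate T cells ex vivo (CD3/CD28 beads with IL-2)"
    TRANSDUCTION = "transfer the CAR gene (viral or nonviral approach)"
    EXPANSION = "expand the T cell population"
    REINFUSION = "infuse the CAR-T cells back into the patient"


def workflow(expand_before_transduction: bool = False) -> List[Step]:
    """Return the steps in order; only the two middle steps may swap."""
    middle = [Step.TRANSDUCTION, Step.EXPANSION]
    if expand_before_transduction:
        middle.reverse()
    return [Step.LEUKAPHERESIS, Step.ACTIVATION, *middle, Step.REINFUSION]


for number, step in enumerate(workflow(), start=1):
    print(f"{number}. {step.value}")
```

Calling `workflow(expand_before_transduction=True)` gives the alternative ordering in which the cells are expanded first and genetically modified afterward.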
Clinical application of CAR-T cell therapy
At present, approximately 200 CAR-T studies are registered on clinicaltrials.gov. Three leading CAR-T cell products have received FDA “Breakthrough Therapy” designation: tisagenlecleucel (Novartis/ELIANA) for relapsed/refractory B cell acute lymphoblastic leukemia (R/R B-ALL), JCAR015 (Juno/ROCKET) for R/R B-ALL, and axicabtagene ciloleucel (Kite/ZUMA-1) for relapsed/refractory diffuse large B cell lymphoma (R/R DLBCL). To date, the FDA has approved tisagenlecleucel and axicabtagene ciloleucel, and the clinical trial of JCAR015 has been stopped because of toxicity.
Tisagenlecleucel (CTL019) was the first FDA-approved CAR-T cell therapy in the United States. On August 30, 2017, the FDA approved tisagenlecleucel for the treatment of patients aged 25 or younger with R/R B-ALL. The patient’s own T cells undergo genetic transduction to express CARs targeting CD19, and the engineered CAR-T cells are subsequently infused back into the patient. According to the FDA, tisagenlecleucel has striking clinical effects against R/R B-ALL: the overall remission rate among 63 response-evaluable patients who received this therapy in the multicenter phase II trial was 83%.
Axicabtagene ciloleucel became the second FDA-approved CAR-T cell therapy for blood cancers in October 2017. This product is indicated for the treatment of adult diffuse large B cell lymphoma. Axicabtagene ciloleucel comprises an extracellular scFv specific for CD19 and the signaling domains CD3ζ and CD28, so that the engineered cells can bind to and destroy CD19+ cancer cells. The efficacy and safety of axicabtagene ciloleucel were established in a multicenter phase II clinical trial of 101 patients with DLBCL. The complete remission (CR) rate for DLBCL was 54% at 6 months of follow-up.
Both tisagenlecleucel and axicabtagene ciloleucel are second-generation CAR-T products. They involve genetically modifying a patient’s own T cells to express CARs that target CD19 on the surface of the cancer cells. Although they work in a similar way, they have different indications. Since the expression of CD19 is restricted to the B cell lineage, CD19-targeted CAR-T cells have shown promise in B cell neoplasms. The anti-CD19 chimeric antigen receptor (CART19) therefore appears to be an attractive approach against B cell lymphomas.
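As a rough check on the trial figures quoted above, the reported percentages can be converted back into approximate patient counts. The helper name `approx_responders` is hypothetical, and because the published rates are rounded, the resulting counts are estimates rather than reported values.

```python
def approx_responders(n_patients: int, rate_percent: float) -> int:
    """Estimate the number of responders from a rounded percentage."""
    return round(n_patients * rate_percent / 100)


# Figures quoted in the text; the rates are rounded, so counts are approximate.
tisagenlecleucel = approx_responders(63, 83)   # overall remission, R/R B-ALL
axicabtagene = approx_responders(101, 54)      # complete remission at 6 months, DLBCL

print(f"tisagenlecleucel: about {tisagenlecleucel} of 63 patients")
print(f"axicabtagene ciloleucel: about {axicabtagene} of 101 patients")
```

The two figures are not directly comparable: they come from different trials, different diseases, and different response definitions (overall remission versus complete remission).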
Challenges to CAR safety
Despite promising clinical results, toxicity is still a major concern for all CAR-T cell therapies. The most prevalent toxicities within a few days after infusion of CAR-T cells are cytokine release syndrome (CRS) and the destruction of healthy tissues. CRS, also known as a cytokine storm, is a flu-like systemic response to the CAR-T cells. It is caused by the wide distribution of T cells, their simultaneous activation, and the subsequent release of inflammatory mediators such as cytokines, chemokines, and immune effector molecules. This syndrome may induce high fever, hypotension, hypoxemia, and even organ failure. In the phase II ZUMA-1 trial of axicabtagene ciloleucel, CRS occurred in 94% of patients, and 13% of patients experienced life-threatening symptoms. Although cytokine blockade with agents such as the interleukin-6 receptor antibody tocilizumab has been used successfully to ameliorate CRS, more fundamental solutions, such as short-lived CAR-T cells or split CAR-T cell dosing, are needed to reduce and prevent overwhelming T cell activity.
CAR-T cells can also destroy normal cells that express the targeted antigen, and CAR-transduced T cells may react with more than one antigen. Since CD19 is expressed on healthy B cells and is used as the targeting molecule for CAR-T cell therapy, the CARs may cause damage to healthy B cells. This type of on-target toxicity can be life-threatening, especially in patients lacking B cells.
(1) Rosenbaum L. Tragedy, perseverance, and chance - the story of CAR-T therapy. New England Journal of Medicine. 2017 Oct 5;377(14):1313-1315.
(2) Jain MD, Davila ML. Concise review: emerging principles from the clinical application of chimeric receptor T cell therapies for B cell malignancies. Stem Cells. 2017 Oct 11. doi: 10.1002/stem.2715.
(3) Mullard A. FDA approves first CAR T therapy. Nature Reviews Drug Discovery. 2017 Sep 29;16(10):669.
(4) Sadelain M, Riviere I, Brentjens R. Targeting tumours with genetically enhanced T lymphocytes. Nature Reviews Cancer. 2003;3:35-45.
(5) Kingwell K. CAR-T therapies drive into new terrain. Nature Reviews Drug Discovery. 2017 Apr 28;16(5):301-304.
(6) Sadelain M, Brentjens R, Riviere I. The basic principles of chimeric antigen receptor design. Cancer Discovery. 2013;3:388-398.
(7) Holzinger A, Barden M, Abken H. The growing world of CAR-T cell trials: a systematic review. Cancer Immunology Immunotherapy. 2016;65:1433-1450.
(8) Jackson HJ, Rafiq S, Brentjens RJ. Driving CAR-T-cells forward. Nature Reviews Clinical Oncology. 2016;13:370-383.
(9) Jensen MC, Riddell SR. Designing chimeric antigen receptors to effectively and safely target tumors. Current Opinion in Immunology. 2015;33:9-15.
(10) Savoldo B, Ramos CA, Liu E, et al. CD28 costimulation improves expansion and persistence of chimeric antigen receptor-modified T cells in lymphoma patients. The Journal of Clinical Investigation. 2011 May;121(5):1822-1826.
(11) Sadelain M, Brentjens R, Riviere I. The promise and potential pitfalls of chimeric antigen receptors. Current Opinion in Immunology. 2009;21:215-223.
(12) Gargett T, Brown MP. Different cytokine and stimulation conditions influence the expansion and immune phenotype of third-generation chimeric antigen receptor T cells specific for tumor antigen GD2. Cytotherapy. 2015;17(4):487-495.
(13) Chmielewski M, Abken H. TRUCKs: the fourth generation of CARs. Expert Opinion on Biological Therapy. 2015;15(8):1145-1154.
(14) Piscopo NJ, Mueller KP, Das A, Hematti P, Murphy WL, Palecek SP, Capitini CM, Saha K. Bioengineering Solutions for Manufacturing Challenges in CAR-T Cells. Biotechnology Journal. 2017 Aug 25. doi: 10.1002/biot.201700095.
(15) Olweus J. Manufacture of CAR-T cells in the body. Nature Biotechnology. 2017;35(6):520-521.
(16) Golubovskaya V. CAR-T Cell Therapy: From the Bench to the Bedside. Cancers (Basel). 2017 Oct 31;9(11). pii: E150.
(17) Kenderian SS, June CH, Gill S. Generating and Expanding Autologous Chimeric Antigen Receptor T Cells from Patients with Acute Myeloid Leukemia. Methods in Molecular Biology. 2017;1633:267-276.
(18) Kebriaei P, Izsvák Z, Narayanavari SA, Singh H, Ivics Z. Gene Therapy with the Sleeping Beauty Transposon System. Trends in Genetics. 2017;33(11):852-870.
(19) Katayama H, Yasuchika K, Miyauchi Y, et al. Generation of non-viral, transgene-free hepatocyte like cells with piggyBac transposon. Scientific Reports. 2017 Mar 15;7:44498. doi: 10.1038/srep44498.
(20) Guha TK, Edgell DR. Applications of Alternative Nucleases in the Age of CRISPR/Cas9. International Journal of Molecular Sciences. 2017 Nov 29;18(12). pii: E2565. doi: 10.3390/ijms18122565.
(21) Locke FL, Davila ML. Regulatory challenges and considerations for the clinical application of CAR-T cell anti-cancer therapy. Expert Opinion on Biological Therapy. 2017;17:659-661.
(22) US Food and Drug Administration. FDA briefing document: Oncologic Drugs Advisory Committee meeting; BLA 125646; Tisagenlecleucel, Novartis Pharmaceuticals Corporation. FDA. https://www.fda.gov/downloads/AdvisoryCommittees/CommitteesMeetingMaterials/Drugs/OncologicDrugsAdvisoryCommittee/UCM566166.pdf (2017).
(23) FDA Approves Second CAR-T-cell Therapy. Cancer Discovery. 2017 Nov 7. doi: 10.1158/2159-8290.CD-NB2017-155.
(24) Locke FL, Neelapu SS, Bartlett NL, et al. Primary results from ZUMA-1: a pivotal trial of axicabtagene ciloleucel (Axi-cel; KTE-C19) in patients with refractory aggressive non-Hodgkin lymphoma (NHL). Proceedings of the 107th Annual Meeting American Association Cancer Research; Washington (DC): AACR; 2017. p. CT019.
(25) Roberts ZJ, Better M, Bot A, Roberts MR, Ribas A. Axicabtagene ciloleucel, a first-in-class CAR-T cell therapy for aggressive NHL. Leukemia and Lymphoma. 2017 Oct 23:1-12.
(26) Maude SL, Frey N, Shaw PA, et al. Chimeric antigen receptor T cells for sustained remissions in leukemia. New England Journal of Medicine. 2014;371:1507-1517.
(27) Grupp SA, Maude SL, Shaw PA, et al. Durable Remissions in Children with Relapsed/Refractory ALL Treated with T Cells Engineered with a CD19-Targeted Chimeric Antigen Receptor (CTL019). Blood. 2015;126:681.
(28) Brudno JN, Kochenderfer JN. Toxicities of chimeric antigen receptor T cells: Recognition and management. Blood. 2016;127:3321-3330.
(29) Kalaitsidou M, Kueberuwa G, Schütt A, Gilham DE. CAR-T-cell therapy: toxicity and the relevance of preclinical models. Immunotherapy. 2015;7(5):487-497.
(30) Lee DW, Gardner R, Porter DL, Louis CU, Ahmed N, Jensen M, Grupp SA, Mackall CL. Current concepts in the diagnosis and management of cytokine release syndrome. Blood. 2014;124:188-195.
ACS San Diego Local Section
The San Diego Local Section of the American Chemical Society is proud to support JOURNYS. Any student in San Diego is welcome to get involved with the ACS San Diego Local Section. Find us at www.sandiegoacs.org! Here are just a few of our activities and services:
Chemistry Olympiad The International Chemistry Olympiad competition brings together the world’s most talented high school students to test their knowledge and skills in chemistry. Check out our website to find out how you can participate!
ACS Project SEED This summer internship provides economically disadvantaged high school juniors and seniors with an opportunity to work with scientist-mentors on research projects in local academic, government, and industrial laboratories.
College Planning Are you thinking about studying chemistry in college? Don’t know where to start? Refer to our website to learn what it takes to earn a degree in chemistry, the benefits of finding a mentor, building a professional network, and much more!
Editor in Chief: Ethan Tan
Assistant Editors-in-Chief: Jessie Gan and Angela Liu
Section Editor: Katherine Izhikevich
Copy Editors: Arda Ulug and Daniel Kim
Co-Presidents: Johnny Lu and Claire Wang
Vice President: William Zhang
Coordinators: Sua Kim, Nathaniel Chen, Allison Jung
Scientist Review Board Coordinators: Claire Wang and Johnny Lu
Design Managers: Anna Jeong and Daniel Kim
Designers: Anna Jeong, Daniel Kim, Saeyeon Ju, Esther Choi
Graphics Manager: Seyoung Lee
Graphic Artists: Seyoung Lee, Lesley Moon, Saeyeon Ju, Anna Jeong, Daniel Kim
Media Managers: Anna Jeong and Katherine Izhikevich
Contributing Writers: Do Hun Kim, Dahun Ryu, Wesley Huang, Daniel Liu, Vainavi Viswanath, Benjamin Konecny, Pulkit Rampa, Clyde Xu
Contributing Editors: Deborah Chen, Heidi Shen, Ore James, Josephine Kim, Aaron Sun, Louise Anderfaas, Anshi Vora, Briani Zhang, Likith Palabindela, Elliot Kim, Jade Nam
Staff Advisor: Mrs. Mary Ann Rall
Scientist Review Board: Dr. Caroline Kumsta (Sanford Burnham Prebys Medical Discovery Institute), Dr. Kanaga Rajan (UCSD/Sanford Consortium for Regenerative Medicine), Mr. Pranjali Beri (UCSD), Dr. Ceren Yardimci Tumay (Hacettepe University), Dr. Danielle Hagstrom, Dr. Tapas Nag (All India Institute of Medical Sciences), Mr. Alexander Bakst (Qualcomm, Inc), Mr. Phillip Kyriakakis (UCSD), Mr. Daniel Garcia (UCSD), Dr. Corinne Lee-Kubli (Salk Institute), Ms. Christina Hoong (UCSD), Dr. Shannon Woodruff (HP Inc.)
We are excited to release the first issue of JOURNYS for the 2018-19 school year! We really appreciate the patience our readers have shown for this issue, as it had its fair share of delays and complications. Despite this, our wonderful JOURNYS team has worked tirelessly to piece together this magazine, and we hope you all have been inspired to pursue some aspect of STEM further or even contribute some of your own work to our next issue!
In this particular issue, we saw an increased number of individual research article submissions. From linguistic computer learning programs to the evaluation of oral bacteria based on their resistance to beer, of all things, research in all its forms is beneficial, as it helps us to understand our ever-changing world. JOURNYS hopes to expand upon the insatiable curiosity we all possess and prove that even ordinary high school students can make a difference in the world in some way or another.
As always, we’d like to offer our gratitude to our incredible JOURNYS team for all their dedication and hard work, our advisor Mrs. Rall for her kindness and guidance, and our sponsor, the San Diego American Chemical Society, for their generosity. And to you too, reader. Thank you for taking the time to read these pages; all of your support has helped JOURNYS thrive and grow into what it is today. We are excited for what the rest of the year has in store, and we hope that you all will be inspired to dive deeper into STEM and academia by the work of your fellow students in JOURNYS.
Best,
Johnny, Claire, and Ethan