
Broad Street Scientific

Volume 5 | 2015-2016

The North Carolina School of Science and Mathematics Journal of Student STEM Research



Table of Contents

A Letter from the Chancellor


Broad Street Scientific Staff


Words from the Editors


Quantum Physics: A Strange World Seeking High School Students Ahmad Askar, 2017

Mechanism of Inactivation of Piezo Ion Channels Alisa Cui, 2016

Creation of Plasmodium falciparum Hsp90 Selective Inhibitors for Antimalarial Drug Development Vibha Puri, 2016


Development of a Functional Electrochromic Device and Syntheses of [Si(tolylterpy)2](PF6)4 and [Si(bpy)3](PF6)4 Shreya Patel, 2016


Computational Modeling of Alcoholic Fermentation in a Population of Saccharomyces cerevisiae in a Closed Environment Jack R. McCluskey, 2016 Online


Katherine Li, 2016


Gravity Wave Disturbances in the F-region Ionosphere Above Large Earthquakes


Sunwoo Yim, 2016

Margie Bruff, 2016

Combination of Microneedles and Ultrasound for the Transdermal Treatment of Melanoma Sophia Hu, 2016

Modelling Causes of Honey Bee Colony Collapse Through Population Dynamics


Cynthia Dong, 2016 Online

An Analysis of Recursive Properties on Counting Independent Sets of Select Graphs


Peter Cheng, 2016 Caleb Cox, 2016

Non-S-Figurate Numbers Peter Cheng, 2016 Vinit Ranjan, 2016 Kelly Zhang, 2016



Feature Article: An Interview with Maya Ajmera


Physics/Engineering Math/CompSci

Comparison of Support Vector Regression Models of Transcription Factors E2F1 and E2F4’s Binding Specificities to DNA Sequences


Conformational dynamics of HIV-1 variable loop domains for CCR5-using M-tropic and T cell-tropic viruses


Letter from the Chancellor

“There are those that say we cannot afford to invest in science, that support for research is somehow a luxury at moments defined by necessities. I fundamentally disagree. Science is more essential for our prosperity, our security, our health, our environment, and our quality of life than it has ever been before.” ~ President Barack Obama

I am proud to introduce the fifth edition of the North Carolina School of Science and Mathematics’ (NCSSM) scientific journal, Broad Street Scientific. Each year students at NCSSM conduct significant scientific research, and Broad Street Scientific showcases some of that work. Whether you measure it through breakthroughs in medicine that help cure disease, new technologies that expand our ability to understand the world around us, or impact on our nation’s economic growth, the importance of scientific research cannot be overstated. NCSSM believes that by providing students with opportunities to apply their learning through research we are not only preparing and exciting students to pursue STEM degrees and careers after high school, but also encouraging the next generation of innovative thinkers who can scientifically address the major challenges and problems we face in the world today and those that we will face in the future.

Opened in 1980, NCSSM was the nation’s first public residential high school where students study a specialized curriculum emphasizing science and mathematics. Teaching students to do research and providing them with opportunities to conduct high-level research in biology, chemistry, physics, the applied sciences, and humanities is a critical component of NCSSM’s mission to educate academically talented students to become state, national and global leaders in science, technology, engineering and mathematics.

The research showcased in this publication is an example of the significant research that students conduct each year at NCSSM under the direction of the outstanding faculty at our school and in collaboration with researchers at major universities. For thirty-one years NCSSM has showcased student research through our annual Research Symposium each spring and at major research competitions such as the Siemens Competition in Math, Science and Technology, the Intel Science Talent Search, and the International Science and Engineering Fair, to name a few. The publication of Broad Street Scientific provides another opportunity to highlight the outstanding research being conducted by students each year at the North Carolina School of Science and Mathematics.

I would like to thank all of the students and faculty involved in producing Broad Street Scientific, particularly faculty sponsor Dr. Jonathan Bennett and senior editors Nimit Desai, Rishi Sundaresan and Sicheng Zeng. Explore and enjoy!

Sincerely,
Dr. Todd Roberts, Chancellor
North Carolina School of Science and Mathematics


Broad Street Scientific Staff

Chief Editors

Nimit Desai, 2016 Rishi Sundaresan, 2016

Publication Editors

Sicheng Zeng, 2016 Elizabeth Dogbe, 2017 Jennifer Fang, 2017 Avra Janz, 2017

Biology Editors

Robert Fisher, 2016 Sarah Grade, 2017 Dory Li, 2017

Physics Editors

Chase Roycroft, 2016 Sreeram Venkat, 2017

Chemistry Editors

Sayan Dutta, 2017 Cameron Herrera, 2016 Karl Westendorff, 2017

Engineering Editors

Taesoo Daniel Lee, 2016 Murali Saravanan, 2016

Math and Computer Science Editors

Nikhil Reddy, 2017 Sarah Wu, 2016


Andrew Spencer, 2016 Miguel de los Reyes, 2017

Faculty Advisor

Dr. Jonathan Bennett


Words from the Editors

Welcome to the Broad Street Scientific, NCSSM’s journal of student research in science, technology, engineering, and mathematics. In this fifth edition of the Broad Street Scientific, we aim not only to showcase student research, but also to increase awareness of the importance of participation in science by demonstrating the scientific aptitude of our students to readers both inside and outside of the NCSSM community. We hope you enjoy this year’s issue.

The theme for this year’s volume of Broad Street Scientific is based on neurospheres, which, in layman’s terms, are clusters of cells, some of which are neural stem cells. They are used in neurosphere assays that attempt to develop these stem cells in vitro so that they can then be transplanted to regenerate various nervous tissues. These fascinating structures have the potential to revolutionize medicine. We thank the following photographers for allowing us to use their images: Dr. Alan Burns and Dr. Ellen Binder (for their image that won the Anatomical Society Best Image Prize October 2013) (inside images), Dr. Micheal Weible (for his image that won Honorable Mention in the Olympus Bioscapes International Digital Imaging Competition 2005) (front cover), and Dr. Rick Livesey (back cover).

We would like to thank the administration, faculty, and staff of NCSSM for the opportunity to pursue our research goals in the science, technology, engineering and mathematics fields. The support for student research at this school is unmatched by any other high school in the state, and the student body would like to recognize the significance of such an investment in our, and the state’s, future. We would like to specifically thank our faculty advisor, Dr. Jonathan Bennett, for his advice and guidance through the fifth edition of the Broad Street Scientific. We would also like to thank our Chancellor, Dr. Todd Roberts, Dean of Science, Dr. Amy Sheck, and Research/Mentorship Coordinator, Dr. Sarah Shoemaker, for their active support of this publication. Lastly, the Broad Street Scientific is extremely grateful to NCSSM alum Maya Ajmera, the CEO of the Society for Science and the Public, for her participation in this year’s interview and her insight for the next generation of scientists and entrepreneurs.



Quantum Physics: A Strange World Seeking High School Students

Ahmad Askar

Ahmad Askar was selected as the winner of the 2015-2016 Broad Street Scientific Essay Contest. His award included the opportunity to interview Maya Ajmera as part of the Featured Scientist section of the journal.

The thought of quantum physics often sends a shiver up the spines of high school students. This should not be happening: quantum physics should be at the crux of the daily studies of the average American high school student. Quantum physics is the lingua franca of the modern world and the tool for deciphering the mysteries of the invisible world. The field is hyperbolized to seem menacing, but at heart its concepts and theories are intuitive and, at the very least, interesting. Quantum physics should be studied by high school students all over the nation in order to graduate well-educated students into the seemingly unforgiving world of Quantum Field Theory.

But first, what is quantum physics? Quantum physics is the study of wave mechanics. The de Broglie hypothesis (1924) set the stage for a new kind of physics [1]. With the relation between momentum and wavelength, classical particles can be described in terms of a wave, famously known as the wavefunction (ψ) [2]. Erwin Schrödinger theorized an equation to derive the wavefunction of non-relativistic particles, the renowned Schrödinger Equation (1926) [1]. In simpler terms, this equation states that the sum of kinetic energy and potential energy of the wavefunction is equal to the total energy of the wavefunction, or mathematically (in one dimension, for a particle of mass m in a potential V(x)):

\[ -\frac{\hbar^2}{2m}\,\frac{d^2\psi}{dx^2} + V(x)\,\psi = E\,\psi \]
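
To illustrate how an equation of this kind produces discrete energies, here is a minimal numerical sketch. It is an assumption-laden illustration rather than anything taken from the cited texts: it treats a particle confined between two impenetrable walls (V = 0 inside a box of width 1), works in units where ħ = m = 1, and uses a simple "shooting" approach with hypothetical helper names `psi_at_wall` and `find_level`. The wavefunction is integrated across the box for a trial energy, and only energies for which ψ vanishes at the far wall are accepted.

```python
import math

def psi_at_wall(energy, steps=2000, width=1.0):
    """Integrate psi'' = -2*E*psi (hbar = m = 1, V = 0 inside the box)
    from x = 0 to x = width with psi(0) = 0, psi'(0) = 1, and return psi(width)."""
    h = width / steps
    psi, dpsi = 0.0, 1.0
    for _ in range(steps):
        # Velocity-Verlet step for the second-order equation.
        acc = -2.0 * energy * psi
        psi_new = psi + dpsi * h + 0.5 * acc * h * h
        acc_new = -2.0 * energy * psi_new
        dpsi += 0.5 * (acc + acc_new) * h
        psi = psi_new
    return psi

def find_level(e_low, e_high, tol=1e-8):
    """Bisect on energy until psi vanishes at the far wall (an allowed level).
    Assumes exactly one allowed energy lies between e_low and e_high."""
    f_low = psi_at_wall(e_low)
    while e_high - e_low > tol:
        mid = 0.5 * (e_low + e_high)
        f_mid = psi_at_wall(mid)
        if f_low * f_mid <= 0.0:
            e_high = mid
        else:
            e_low, f_low = mid, f_mid
    return 0.5 * (e_low + e_high)

# Energy brackets chosen by hand for this illustration (an assumption).
levels = [find_level(lo, hi) for lo, hi in [(1.0, 10.0), (10.0, 30.0), (30.0, 60.0)]]
# Textbook result for this box in these units: E_n = n^2 * pi^2 / 2.
exact = [n * n * math.pi ** 2 / 2.0 for n in (1, 2, 3)]
for n, (numeric, analytic) in enumerate(zip(levels, exact), start=1):
    print(f"n={n}: numeric {numeric:.4f}, analytic {analytic:.4f}")
```

Only isolated energies satisfy the boundary condition, and they scale as n², which is the kind of discreteness the essay describes. The integration itself is nothing more than repeated arithmetic on a second-order differential equation.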

From the Schrödinger Equation stems the analysis of various potential energies of the wavefunction. Quantum physics deals with solving the wavefunction for quantum particles in settings that range from potential wells (E > V), which give the form $\psi = e^{ikx}$, to potential barriers (E < V), which give the form $\psi = e^{kx}$ [2]. It should be noted that the Schrödinger Equation can be simplified to a second-order differential equation, well within the capability of a large number of high school students.

It was not until the mid-twentieth century that quantum physics began to make remarkable progress. Namely, the invention of operator algebra by von Neumann (1932) set the stage for firm theoretical analysis of wavefunctions [1]. Operators are mathematical matrices that were invented to analyze the properties of wavefunctions [3]. With no analog in classical mechanics, operator methods are able to explain why electrons occupy discrete energy and angular momentum values, called eigenvalues. Solving the wavefunction for an electron in an atomic potential gives the result that energy and angular momentum are not only discretized but occupy integer values [2]. Extensive theoretical research has been performed on the Schrödinger Equation, and the invention of the Scanning Tunneling Microscope and other breakthrough inventions of the twentieth century are direct consequences of the Schrödinger Equation [7].

More involved quantum theory deals with assigning wavefunctions to multi-electron atoms. Douglas Hartree developed a crude technique, called Hartree Theory (1928), to find the effective potential energy of electrons in the nth state of an atom [4]. His discoveries led to the explanation of the complex electronic structure of atoms with atomic numbers larger than that of hydrogen, as well as an approximation of the separation distance between electrons and protons. However, Hartree Theory was not enough to explain atoms with high atomic numbers, mainly because the method did not account for interference between neighboring electrons, so further research needed to be done [2]. Hence Hohenberg and Kohn (1964) developed a more comprehensive method for solving the wavefunction of molecules, which mainly involved adding corrections to the potential energy term of the electron [5].

In the contemporary world, students across America are taught only classical mechanics. Although seen as the fundamentals of physics, classical mechanics is archaic physics, and more students need to learn quantum mechanics at the elementary level, for a number of reasons. First, students who learn quantum mechanics will make a smoother transition to college, where unforgiving classes will not take the time to explain the evident quirks of quantum theory.
Furthermore, because quantum physics is the language of the modern world, the research sphere needs more students who are fluent in quantum mechanics, and that cannot be achieved if there is no initiative to implement the study at the high school level. The majority of physics research falls into the category of condensed matter physics or optical physics, neither of which would exist without quantum physics [6]. Moreover, quantum physics will broaden students’ ability to think differently and approach other sciences in a more inquisitive manner. General chemistry students, for example, are taught the essence of quantum numbers, but this superficial introduction could be deepened considerably if the students understood the reasoning behind discretized energy states and why they exist.

There is no need for high schools to teach students how to solve higher-order differential equations, but there is a need to allow interested students to grasp the concepts of quantum physics, and to educate them about the gradual rise and importance of quantum theory in the field of physics. With this in mind, the standard education board should slowly implement a quantum mechanics course in the curriculum. Students should first learn the postulates of the field and be introduced to the wavefunction as well as the de Broglie relation. Moving forward, there should be a study of the Schrödinger Equation as well as the behavior of quantum particles in various potentials. The electronic structure of hydrogen should be included, as well as a later discussion of molecules. If America seeks to thrust itself forward in science, there is a dire need to supply high-value education to its secondary school students, and quantum physics should be a top priority.

References
[1] “A History of Quantum Mechanics.” Quantum Mechanics History. JOC/EFR, May 1996. Web. 15 Jan. 2016. < HistTopics/The_Quantum_age_begins.html>.
[2] Eisberg, Robert Martin, and Robert Resnick. Quantum Physics of Atoms, Molecules, Solids, Nuclei, and Particles. New York: Wiley, 1985. Print.
[3] Shankar, Ramamurti. Principles of Quantum Mechanics. New York: Plenum, 1980. Print.
[4] “Physics Today.” Physics Today. AIP Publishing LLC, 2016. Web. 15 Jan. 2016. < content/aip/magazine/physicstoday>.
[5] Harrison, N. M. An Introduction to Density Functional Theory. London: Imperial College of Science and Technology, n.d. Web.
[6] “Duke Physics.” Duke Physics. Duke, 2016. Web. 15 Jan. 2016. <>.
[7] “The Scanning Tunneling Microscope.” The Scanning Tunneling Microscope. Nobel Media AB, 2016. Web. 15 Jan. 2016. < physics/microscopes/scanning/>.


Chemistry Research


Mechanism of Inactivation of Piezo Ion Channels

Alisa Cui

ABSTRACT

Piezo proteins have been identified as key components of mechanosensitive ion channels in humans and other mammals. These channels are in the early stages of characterization; their mechanisms of inactivation (the decrease in channel activity in the continued presence of a stimulus) remain unclear. Previous work shows that inactivation is faster for inward than outward Piezo1 currents. In the standard recording technique, two variables - voltage and direction of ion permeation - change in the transition from inward to outward currents. In this project, the concentration of permeable ions in the extracellular solution of the cell-attached method of patch-clamp electrophysiology was manipulated to shift the voltage at which ion permeation changed direction. Analysis of inactivation kinetics following this shift determined which variable was responsible for differing inactivation speeds. Distinguishing between the variables provides insight into the mechanism for inactivation of the Piezo1 ion channel; it was hypothesized that Piezo1 follows C-type inactivation, a mechanism characterized by a conformational change in the channel near the pore mouth and whose speed responds to permeable ion concentration. Data collected showed that permeable ion concentration did not significantly change inactivation kinetics, suggesting that C-type inactivation is not the mechanism of inactivation of this Piezo channel.

1. Introduction

Sensation and perception, or the processes of sensing and interpreting the environment, are crucial to the survival of our species. Thanks to these processes, humans have evolved and adapted in ways that allow us to function despite the dangers posed by our environment. For example, the ability to taste bitterness dissuades us from eating potentially harmful foods, and detection of a hot surface motivates us to move our hand before it is burned [1]. A critical sense is the ability to detect mechanical stimuli; this process is not only key to sensing what is physically around us but also helps with balance and pain. When a person extends their hand to touch a table, sensory neurons relay the signal to the central nervous system, allowing the person to register that they are feeling pressure as a result of contact with the table. The focus of this project lies in the cellular basis of the phenomenon of touch sensation. In particular, the methodology of patch-clamp electrophysiology allows us to analyze the first step in mechanosensation: the activation of a specific type of ion channel by environmental stimuli.

Ion channels are a broad family of proteins that exist within the cell membrane to allow the passage, both inwards and outwards, of ions. These structures are responsible for the successful transduction of sensory signals; the various types of ion channels may be activated by such stimuli as sweet, savory, and bitter tastes, hyperosmolarity, and chemicals such as capsaicin [2, 3, 4]. The ion channels responsible for sensation of hot and cold, and for tastes such as capsaicin (the “hot” in hot peppers) and menthol, have been isolated and studied for some time; however, the same cannot be said for mammalian mechanically activated ion channels. These mechanosensitive channels have been widely studied in prokaryotic organisms, which contain the families of mechanosensitive channels known as MscL and MscS, the mechanosensitive channels of large and small conductance, respectively [5]. MscL has been used to study mechanotransduction, which led to understanding tension as the proposed stimulus in channel gating as well as the important role of MscL channels in regulating cell pressure to prevent lysis [6, 7]. However, the results of these studies do not necessarily apply to mechanosensation in mammals, and until the 2010 discovery of the Piezo protein by Coste et al., no mammalian mechanosensitive channels had been isolated and studied.

Piezo1 and related Piezo2 (Fam38A and Fam38B), initially identified in the mouse neuroblastoma cell line Neuro2A, are key components of mammalian mechanosensation. The proteins are found in dorsal root ganglion (DRG) neurons as well. Piezo1 is expressed in the bladder, colon, kidney, lung, and skin, among other locations, and is activated by positive and negative pressure with sensitivity to changes in positive and negative voltage, resulting in both inwards and outwards currents [8].

Inactivation of Piezo channels, or the decrease in channel activity in the continued presence of a stimulus, is a characteristic of particular interest to this project. Earlier work with Piezo channels has shown that inactivation kinetics can be fit with a single exponential equation, yielding the inactivation time constant τinac. These τinac values can be compared across different conditions (pressure, voltage) and for inwards and outwards current. Certain trends have been observed, including faster kinetics of inactivation (smaller τinac) for Piezo2 when compared to Piezo1 and voltage dependency (once activated, current size can be influenced by positive or negative voltages) in both channel types [8, 9]. Despite this knowledge of inactivation characteristics, the precise mechanisms of inactivation for Piezo channels remain unknown.
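
As a rough illustration of what fitting inactivation with a single exponential involves, the sketch below estimates a time constant from a decaying current of the form I(t) = A·exp(-t/τ). It is a simplified stand-in, not the fitting routine used in this project: it assumes a baseline-subtracted, strictly positive current magnitude with no steady-state offset, so that a straight-line fit to ln(I) recovers τ. The function name `fit_single_exponential` and the synthetic trace are hypothetical.

```python
import math

def fit_single_exponential(times_ms, currents):
    """Estimate (A, tau) for I(t) = A * exp(-t / tau) by ordinary least squares
    on ln(I) versus t. Requires strictly positive current magnitudes."""
    logs = [math.log(i) for i in currents]
    n = len(times_ms)
    mean_t = sum(times_ms) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_ms)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_ms, logs))
    slope = sxy / sxx                 # slope of ln(I) vs t equals -1 / tau
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope

# Synthetic, noise-free trace with made-up parameters (not data from this study).
true_amplitude, true_tau_ms = 1.0, 15.0
times = [0.5 * k for k in range(100)]
trace = [true_amplitude * math.exp(-t / true_tau_ms) for t in times]

amplitude, tau_inac = fit_single_exponential(times, trace)
print(f"A = {amplitude:.3f}, tau_inac = {tau_inac:.2f} ms")
```

A smaller τinac corresponds to faster inactivation. With noisy recordings or a non-zero steady-state current, a nonlinear least-squares fit with an offset term would be the more appropriate choice.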
Known mechanisms of inactivation in other ion channels include N-type and C-type inactivation, which occur through very different processes and result in characteristic differences. N-type inactivation, known as fast inactivation, follows a “ball and chain” or “tethered ball” mechanism. It occurs when amino acids from the N-terminus of the protein bind to the intracellular mouth of the channel, essentially blocking it off from the inside [10, 11]. During C-type inactivation, a slower process, the channel undergoes a conformational change near the mouth, restricting it from the outside. C-type inactivation is sensitive to the concentration of permeable ions outside of the cell; increased concentrations promote a slower entry into and a faster recovery from inactivation [12].

Piezo channels exhibit a trend in inactivation that can be used to determine their precise mechanisms of inactivation. From Coste et al. 2010, it is known that inactivation kinetics differ in inward and outward currents; inward current (caused by a net movement of ions into the cell) experiences much faster kinetics of inactivation than outward current. Standard practice for recordings of Piezo currents uses an extracellular solution with roughly the same concentration of permeable ions as inside the cell. Because Piezo is cation nonselective and allows all positively charged ions to pass through equally well, this results in a shift from inward to outward current at approximately 0 mV in standard solutions [8]. The voltage at which current changes direction from inwards to outwards is called the reversal potential. Two factors change at the reversal potential: voltage goes from negative (inward current) to positive (outward current), and the direction of ion permeation changes from inward to outward. Because the standard practice offers no way to separate the two variables, it has previously been difficult to discern which of the two is responsible for the difference in inactivation kinetics of inward and outward currents.
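
The link between permeable ion concentrations and the reversal potential can be sketched with the Nernst relation, E = (RT/zF)·ln([ion]out/[ion]in). The example below is only an idealized estimate under stated assumptions: a single monovalent permeable cation species, a temperature of 22 °C standing in for room temperature, and round illustrative concentrations rather than measured ones. A nonselective channel passing several cation species, including divalent ions, will deviate from this simple picture.

```python
import math

GAS_CONSTANT = 8.314462618   # J / (mol K)
FARADAY = 96485.33212        # C / mol

def nernst_potential_mv(conc_out_mm, conc_in_mm, valence=1, temp_celsius=22.0):
    """Idealized reversal potential (mV) for one permeable ion species.
    Concentrations are in mM; only their ratio matters."""
    temp_kelvin = temp_celsius + 273.15
    volts = (GAS_CONSTANT * temp_kelvin) / (valence * FARADAY) * math.log(conc_out_mm / conc_in_mm)
    return 1000.0 * volts

# Illustrative concentrations only (assumptions, not values measured in this study).
equal = nernst_potential_mv(140.0, 140.0)   # same concentration on both sides
halved = nernst_potential_mv(70.0, 140.0)   # half as much permeable ion outside

print(f"equal concentrations: {equal:.1f} mV")
print(f"half outside:         {halved:.1f} mV")
```

Equal concentrations give a reversal potential of 0 mV, and lowering the outside concentration pushes it negative. This is why changing the extracellular permeable ion concentration offers a way to move the reversal potential away from 0 mV.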
Understanding the mechanisms of inactivation of Piezo ion channels is crucial to the full understanding of their function and the clinical application of this knowledge. For example, mutations in Piezo1 causing significantly slower kinetics of inactivation compared to the wild-type are also related to the human disease Dehydrated Hereditary Stomatocytosis (DHS), in which red blood cells are more permeable to cations and for which there is no cure [13]. Knowledge of the specific mechanisms of inactivation of Piezo1 could point to locations on the protein that should be targeted in the pursuit of more effective treatment for DHS.

It was hypothesized that C-type, or slow, inactivation is the mechanism of inactivation for Piezo channels. This hypothesis is based on the speed of Piezo1 inactivation, which is slow in comparison to similar channels [14], as well as what is currently known about the structure of Piezo1. It has been predicted that Piezo1 is extremely large, with 38 transmembrane domains [15]. This size could make it difficult for anything to physically occlude the opening (the “ball” in the ball-and-chain mechanism of N-type inactivation), and there is nothing to support the existence of such a structure regardless. This leaves C-type inactivation as a more favorable possibility for the mechanism of Piezo inactivation.

2. Methods

2.1 Cell culture and transfection
HEK293t cells were grown at 37°C and 5% CO2 in Dulbecco’s Modified Eagle Medium. Cells were transfected with wild-type mouse Piezo1-IRES-GFP 48 hours before recording and plated on glass coverslips coated in poly-L-lysine and laminin.

2.2 Pipette and bath solutions
Approximately 1 mL of bath solution containing 140 mM KCl, 10 mM HEPES, 1 mM MgCl2, and 10 mM glucose (pH ≈ 7.3) was used for all recordings. Standard pipette solution contained 130 mM NaCl, 5 mM KCl, 10 mM HEPES, 1 mM CaCl2, 1 mM MgCl2, and 10 mM TEACl (pH ≈ 7.3). NaCl in the pipette solution was replaced with N-Methyl-D-glucamine (NMDG) in 50% and 100% ratios for the 50% NMDG and 100% NMDG pipette solutions, respectively.

2.3 Recording
Borosilicate glass pipettes (≈ 2-3 MΩ resistance) were used to create ≥1 GΩ seal patches in the cell-attached mode. At least n=10 cells were patched and recorded using each pipette solution. Recording was performed at room temperature with an EPC10 amplifier and HEKA Patchmaster software; pressure was controlled with an ALA Scientific High-Speed Pressure Clamp.

Figure 1. Visual representations of patches, showing the cell membrane (green), including the section inside the patch (dotted line), and pipette walls (black). Dotted blue lines represent permeable ions, and solid orange circles represent NMDG, the impermeable ion. Inward and outward currents are caused by net movement of the permeable ion in either the inward direction, as indicated by the arrow in (A), or the outward direction, as indicated by the arrow in (C). (A) shows standard solution with equal amounts of permeable ion in the pipette and inside the cell; (B) shows 50% NMDG pipette solution; (C) shows 100% NMDG pipette solution.

Two protocols were run: voltage steps from -100 mV to +100 mV (∆ +20 mV) holding at a constant pressure of -50 mmHg; and negative pressure steps from 0 mmHg to -50 mmHg (∆ -5 mmHg) holding at a constant voltage of -80 mV (standard and 50% pipette solutions) or +80 mV (all pipette solutions). Raw current was analyzed in Igor Pro 6.0 and baseline subtracted before plotting peak currents against pressure and voltage; statistical tests performed were single factor ANOVA and Tukey post-hoc tests. Examples of the results of such protocols after initial raw current baseline subtraction are shown in Figure 2.

Figure 2. Representative traces of protocols run with standard pipette solution, (A) and (B) from the same cell and (C) from a single separate cell. (A) shows current at a constant pressure of -50 mmHg with modulation of voltage from -100 mV to +100 mV. Lines on the left show baselines from each 20 mV voltage step, which were subtracted in the rest of the figure. (B) shows negative pressure steps holding at -80 mV that result in inward current; (C) shows negative pressure steps holding at +80 mV that result in outward current. In (B) and (C), red steps are overlaid traces representing the pressure stimulus; each individual step corresponds to a pressure from 0 mmHg to -50 mmHg.

3. Results

In this project, the inactivation kinetics of Piezo1 mechanosensitive ion channels were analyzed in the presence of three varying ratios of permeable to impermeable ion. These changes were hypothesized to cause changes in the channel sensitivity to pressure stimuli as well as inactivation kinetics. When the cell membrane was subject to a constant negative pressure with modulations in voltage encouraging variances in current size, mean peak current at each voltage for all cells in each respective condition was plotted against voltage to create Figure 3.

Figure 3. Linear fits in the form y=ax+b of mean peak current at voltages from -100 mV to +100 mV, holding at -50 mmHg for standard (n=12), 50% NMDG (n=11) and 100% NMDG (n=12) pipette solutions.

Values were fitted with a linear equation; calculation of the x-intercept yielded the reversal potential, or Erev. This is the voltage at which net current switches from inward to outward. As shown in the legend, reversal potentials for the three pipette solutions appear to show significant differences and become more negative from Erev = -6.1 ± 3.5 mV to Erev = -22.8 ± 6.3 mV and finally Erev = -39.6 ± 9.6 mV as permeable ion is replaced with NMDG. Further statistical analysis of the data confirmed that Erev was indeed different in each solution, p<.05. After confirming that the different pipette solutions caused a change in the reversal potential, data from the second protocol were analyzed in order to determine if permeable ion concentration in the pipette solution affected overall channel sensitivity to pressure. Normalized peak current was averaged for all cells in each condition and plotted against pressure to create Figure 4. The data shown are from outward current at +80 mV.
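
The reversal potential calculation can be sketched as follows: fit peak current against voltage with a line y = ax + b, then take the x-intercept, Erev = -b/a. This is a minimal illustration using ordinary least squares in plain Python, not the Igor Pro procedure used for the analysis. The function names and the current values are hypothetical, and the currents are generated from an ideal line purely to show the arithmetic.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def reversal_potential(voltages_mv, peak_currents_pa):
    """Voltage at which the fitted line crosses zero current (x-intercept)."""
    slope, intercept = linear_fit(voltages_mv, peak_currents_pa)
    return -intercept / slope

# Voltage steps from -100 mV to +100 mV in 20 mV increments, as in the protocol.
voltages = list(range(-100, 101, 20))
# Made-up currents from an ideal line reversing at -20 mV (not measured data).
currents = [0.5 * (v + 20.0) for v in voltages]

e_rev = reversal_potential(voltages, currents)
print(f"Erev = {e_rev:.1f} mV")
```

Currents below the intercept are inward (negative) and those above it are outward (positive), so a more negative Erev widens the range of negative voltages at which outward current can be recorded.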

Figure 4. Data points representing normalized mean peak current from 0 mmHg to -50 mmHg at a constant +80 mV were fitted with a sigmoid with equation $a / (1 + e^{(-x + P_{50})/k})$ for each pipette solution: standard (n=12), 50% NMDG (n=11), and 100% NMDG (n=12). Mean current for all cells at each pressure is shown with the peak normalized to 1.

Values were fitted with a sigmoid, yielding the P50, or pressure at which half of the peak current is produced. An apparent shift in P50 is noticeable between standard solution and solutions containing NMDG, but little to no difference is discernible between the P50 values of pipette solutions containing 50% or 100% NMDG. To confirm the presence or absence of a statistically significant trend, further statistical analysis was performed, yielding p values greater than the accepted standard for significance (p<.05). Results of the Tukey post-hoc analysis showed a p value of p=.09 between the standard and 50% solutions, suggesting a possible trend in P50 shift, and thus channel sensitivity to permeable ion concentration. The data show that manipulation of the permeable ion concentration in the pipette solution did not have a significant effect on the sensitivity of channel activation in response to pressure stimuli. It was hypothesized that kinetics of inactivation would undergo a more dramatic change, so the effect of permeable ion concentration in the pipette solution on inactivation kinetics was analyzed by calculating τinac values from single exponential fit of inactivation kinetics at -100 mV and -50 mmHg pressure, plotted in Figure 5 below.

Figure 5. (A) Example trace of current from one cell at -100 mV and -50 mmHg, fitted with a single exponential function. (B) τinac values from single exponential fit of current at -100 mV and -50 mmHg for individual cells in standard (n=11), 50% NMDG (n=11), and 100% NMDG (n=11) pipette solutions. Markers represent averages with standard error bars.

It is evident that the range of τinac values is large and scattered widely for all three conditions; standard error is large and averages do not appear to follow a trend nor have statistically significant differences. This was confirmed with statistical analysis, yielding p values of p>.05. Because the different pipette solutions caused a statistically significant shift in the Erev of Piezo1, we were able to analyze inactivation kinetics of currents at the same voltage that were inwards using the standard and 50% NMDG pipette solutions but outwards in the presence of the 100% NMDG solution, results of which are shown below in Figure 6.

Figure 6. Mean traces at -20 mV and -50 mmHg in standard (n=8), 50% NMDG (n=10), and 100% NMDG (n=5) pipette solutions, with the peak normalized to one. Raw traces in standard and 50% NMDG pipette solutions were inward currents; raw traces in 100% NMDG were outward currents. Normalization was done in order to more easily compare traces across all three conditions.

Visual inspection shows that inactivation kinetics for these three conditions at -20 mV vary only slightly, if at all; statistical tests of the respective τinac values from individual cells at this voltage in each solution confirmed that p values greatly exceeded the p<.05 standard for significance.

4. Discussion

In order to analyze the effect of permeable ion concentration on inactivation kinetics in Piezo1 mechanosensitive ion channels, channel current was recorded using the method of cell-attached electrophysiology in the presence of pipette solutions with three different ratios of permeable to impermeable ions. Comparison of reversal potentials (Fig. 3) for each pipette solution shows a statistically significant shift between each different solution, serving as a confirmation that changing the pipette solutions causes a change in channel behavior. The shifts, as well as the direction of the shifts, were as expected; ions tend to move from higher to lower concentrations. Following this behavior, there would be a stronger tendency toward outward current (moving from inside the cell to the pipette solution outside the cell) in conditions with a lower permeable ion concentration in the pipette. Creating conditions where the reversal potential was shifted so far in the negative direction provides an opportunity to analyze outward current and its kinetics of inactivation independently of the change in


sign of voltage that typically accompanies it.

First, P50 values of the channel in each pipette solution were compared in order to identify possible changes in overall channel sensitivity to negative pressure stimuli as a result of changes in permeable ion concentration. Analysis of the P50 values for each condition showed a trend in differences between standard pipette solution and NMDG solutions, with no discernible difference between the two NMDG solutions (Fig. 4). This suggests that the concentration of permeable ions in the pipette solution may influence how readily the channel activates, causing outward current, in response to negative pressure stimuli; however, it is unclear whether the shift is real (p>.05). If the shift between P50 values of the standard and NMDG pipette solutions were significant, it would suggest that some sort of saturation point exists, after which decreasing the concentration of permeable ions will no longer cause a significant change in channel activity. This proposed saturation point would have already been reached at a 50% replacement. Further testing with additional pipette solutions containing between 0 and 50% NMDG could determine this value.

Tinac values for the fits of channel inactivation at -100 mV and -50 mmHg were compared to determine if inactivation speed would be affected by the change in permeable ion concentration in the pipette (Fig. 5). While the hypothesized result was a significant shift in speed of inactivation as a response to differing concentrations of permeable ions, there was no trend evident. This argues against the proposed mechanism of C-type inactivation, since Tinac values remain relatively the same despite differences in permeable ion concentration.

A similar analysis of inactivation kinetics of current at -20 mV and -50 mmHg was performed as well. Inactivation was again quantified by a single exponential fit yielding a Tinac value. In the standard and 50% NMDG pipette solutions, this current was inward; use of the 100% NMDG pipette solution shifted the Erev enough that this trace was an outward current. This outward current was produced by ion permeation in the opposite direction from inward current. Mean currents with the peak normalized to one are displayed in Figure 6, and do not appear to differ in any significant trend or manner. Single factor ANOVA and Tukey post-hoc tests confirmed that Tinac values did not undergo a statistically significant shift, despite the change in direction of the current. The hypothesized mechanism of C-type inactivation would have resulted in a change in channel inactivation speed in response to varying concentrations of permeable ions. The data instead argue strongly against C-type inactivation, since inactivation kinetics at the same voltage were not affected by the change in permeable ion concentration or direction.
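As an illustration of the single factor ANOVA mentioned above, the following Python sketch computes the F statistic for three groups of Tinac values. The numbers are hypothetical placeholders, not data from this study, and the function name is an assumption; the Tukey post-hoc step and the conversion of F to a p value are omitted because they require an F or studentized-range distribution that the Python standard library does not provide.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical Tinac values in ms for the three pipette solutions.
tinac_standard = [22.0, 35.0, 18.0, 41.0, 27.0]
tinac_50_nmdg = [30.0, 19.0, 44.0, 25.0, 33.0]
tinac_100_nmdg = [28.0, 39.0, 21.0, 36.0]

f_stat, df_b, df_w = one_way_anova_f([tinac_standard, tinac_50_nmdg, tinac_100_nmdg])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A small F relative to the critical value for the given degrees of freedom corresponds to the outcome reported here, where the group means cannot be distinguished from one another.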

5. Conclusions and Future Work

The goal of this experiment was to shift the Erev of

Piezo1 mechanosensitive ion channels enough in the negative direction so that outward current could be analyzed at a negative voltage. This would allow the comparison of inactivation kinetics of the channel in the presence of differing concentrations of permeable ions at the same voltage, but with a different direction of ion permeation. Using a pipette solution in which all of the major permeable ion was replaced with NMDG caused currents at -20 mV (holding at -50 mmHg) to be outward currents, so that inactivation kinetics could be compared across different concentrations of permeable ion at this voltage. However, the outward currents evoked with the 100% NMDG pipette solution were very small (~10 pA), making it difficult to fit the traces with single exponential fits without large errors. Since the 100% NMDG solution still contained small amounts of other permeable ions (K+, Mg2+, and Ca2+), the Erev shift could likely be exaggerated by removal of some of these ions in an additional pipette solution, promoting a larger magnitude of peak current. Thus, while the argument for the negative hypothesis that Piezo channels do not inactivate by the C-type mechanism is strongly supported by the current data, the validity of the results cannot be completely confirmed until further experimentation with a fourth pipette solution can be performed.

A proposed method of verifying the current conclusion is to perform inside-out patches on the same Piezo1-transfected cells. The inside-out method of patch-clamp electrophysiology allows us to control the cell bath solution as well as the pipette solution; by manipulating the concentration of ions on the inside of the cell membrane while keeping the pipette solution constant, we will potentially be able to measure larger inward current at positive voltages that can be compared to outward currents at the same voltages. This is the converse of what has already been analyzed: outward current at negative voltages.

It is also important to note that the data used to find the P50 values presented in this paper were from outward currents; the ions moving through the channels and causing current originated from inside the cell. Inside-out patches would help in this case to confirm or refute the current conclusion that channel sensitivity to pressure is not influenced by the presence of different concentrations of permeable ions. The data recorded from cell-attached patches in this project provide a strong argument against C-type inactivation. Further experimentation using inside-out patches as described above would provide additional data that would either support the apparent argument against C-type inactivation or suggest another explanation for Piezo inactivation.
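The direction of the Erev shift discussed in this paper can be illustrated with the Nernst relation, E = (RT/zF) ln([ion]out/[ion]in). The Python sketch below is an idealization and not part of the original analysis: it treats a single monovalent cation, whereas Piezo1 passes several cations, and the concentrations and temperature are hypothetical values chosen only to show the trend.

```python
import math

R = 8.314462618    # gas constant, J/(mol*K)
F = 96485.33212    # Faraday constant, C/mol


def nernst_mV(conc_out_mM, conc_in_mM, z=1, temp_C=22.0):
    """Nernst potential in mV for one ion species of valence z."""
    t_kelvin = temp_C + 273.15
    return 1000.0 * (R * t_kelvin) / (z * F) * math.log(conc_out_mM / conc_in_mM)


# Hypothetical concentrations: 150 mM inside the cell, with the pipette
# (extracellular) concentration progressively lowered.
inside_mM = 150.0
for pipette_mM in (150.0, 75.0, 15.0):
    e_rev = nernst_mV(pipette_mM, inside_mM)
    print(f"pipette {pipette_mM:6.1f} mM -> E = {e_rev:7.2f} mV")
```

Lowering the permeable ion concentration on the pipette side makes the computed potential more negative, which matches the direction of the shift described for the NMDG solutions.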

6. Acknowledgments

This project would not have been possible without the laboratory space and mentoring of Dr. Jörg Grandl, the guidance of Dr. Amanda Lewis, and the support and instruction of Dr. Michael Bruno. Additional thanks to NCSSM, the NCSSM Foundation, and Dr. Sarah Shoemaker and the Summer Research Internship Program.



Development of a Functional Electrochromic Device and Syntheses of [Si(tolylterpy)2](PF6)4 and [Si(bpy)3](PF6)4

Shreya Patel

ABSTRACT

Every year, U.S. citizens spend a large share of their energy bill on heating and cooling. This high demand for energy is associated with varying global temperatures and production of greenhouse gases. The electrochromic window is an emerging technology that offers an effective way to decrease energy consumption. Also known as smart windows, these windows use their electrochromic properties to change color, which affects the passage of light through them, and therefore act as an alternative for regulating temperature. Electrochromic windows have been made using transition metal-based compounds; however, these metals are rare and expensive. Some compounds, known as viologens, exhibit electrochromic properties, but are extremely toxic. The goal of this project was to create an electrochromic device that was inexpensive, earth-abundant, and safe for humans using hexacoordinate polypyridylsilicon complexes. The existing compound, [Si(bpy)3](PF6)4, and the novel compound, [Si(tolylterpy)2](PF6)4, were synthesized in this experiment by following and modifying existing procedures. The spectroelectrochemical properties of these compounds were characterized by performing redox reactions on the compounds in a degassed spectroelectrochemical cell. The silicon complex, [Si(bpy)3](PF6)4, was used to develop the first functional electrochromic device from a polypyridylsilicon(IV) complex. Experimentation continues, but the data gathered show that hexacoordinate polypyridylsilicon complexes exhibit electrochromic activity when undergoing low-voltage redox reactions, and therefore could potentially be used in electrochromic devices.

1. Introduction

Because temperature regulation has become not just a desired but a necessary standard in commercial settings, it accounts for a considerable use of energy. Most U.S. citizens spend a large portion (roughly 48%) of their utility bill on heating and cooling, making it a principal source of energy expenditure [1]. In addition, households and commercial buildings made up approximately 41% of the United States' total energy consumption in 2014 [1]. However, the most heavily consumed energy sources for such use are typically non-renewable and, consequently, unsustainable. The use of these sources, such as fossil fuels, also contributes to global warming, which further complicates the issue of temperature regulation in its current state.

To reduce the costs of temperature regulation, individuals and companies have begun to take the initiative to be "green". By installing more efficient, eco-friendly appliances for heating and cooling, consumers are able to reduce the need for temperature maintenance and the associated energy expenditure. Typically, modern buildings are insulated with wool, recyclable paper, or aluminum foil in order to retain or remove heat [1]. The usage of energy-conserving kitchen and washing appliances has also become more prevalent. In addition, renewable energy has become a more supported and popular source of electricity to power homes and other structures.

A new, emerging technology is the smart window. A smart window, or switchable glass, relies on electrochromic properties, in which a material changes color upon application of electrical voltage. These windows save energy because they can change from colorless to opaque with the flick of a switch, thus retaining light in order to heat a room using the greenhouse effect, or reflecting light to cool a room. In turn, this environmentally friendly technology has the capacity to decrease energy consumption.

However, electrochromic windows are very expensive because they are currently developed with scarce and dangerous materials. Research groups have synthesized electrochromic devices using alkylated 4,4'-bipyridines, also called viologens, which are toxic and commonly used in pesticides [2]. One viologen, N,N'-dimethyl-4,4'-bipyridinium dichloride, also called Paraquat, is commonly used as an herbicide and is so toxic that it has been banned in many countries [2]. Transition metal compounds have also been used as the active component of electrochromic devices [3]. However, most transition metals are rare, expensive, and unsustainable. Other electrochromic compounds are ruthenium analogs of our materials. Although they are effective cathodic colorants, they are very rare. In addition, ruthenium dyes do not possess a colorless state, while silicon analogs do [8].

In this research project, we synthesized hexacoordinate polypyridylsilicon complexes in order to create a small-scale, functional electrochromic device. The first is a compound known to have electrochromic properties, [Si(bpy)3](PF6)4. The novel compound to which it will be compared is [Si(tolylterpy)2](PF6)4. These compounds are attractive because they exhibit a broad range of colors when undergoing voltage-induced redox reactions. Another benefit of hexacoordinate silicon complexes is that silicon is earth-abundant and, therefore, relatively inexpensive and sustainable [4]. Synthesizing and utilizing hexacoordinate polypyridyl silicon complexes with a +4 charge is useful because they are easily reduced and switch color with low applied voltage due to their electron-deficient outer shells. Therefore, they can ultimately be used to develop very efficient, functional electrochromic windows with a broad range of colors and uses.

2. Materials and Methods 2.1 Synthesis 1 In order to develop a functional electrochromic window, we synthesized the hexacoordinate polypyridyl silicon complex, tris(2,2'-bipyridyl)silicon(IV) hexafluorophosphate ([Si(bpy)3](PF6)4). The complex ion, [Si(bpy)3]4+ , is shown in Figure 1.

Figure 1. This figure is a 2-D representation of the structure of [Si(bpy)3]4+. The compound we used in this research project was [Si(bpy)3]4+ with 4 (PF6)- counter-ions [3].

We synthesized [Si(bpy)3](PF6)4 following the procedures specified in Suthar, et al. [4]. In this procedure, 1.00 g of SiI4 (1.87 mmol), 1.16 g of 2,2'-bipyridine (7.47 mmol, 4 eq.), and 5.00 g of 2-picoline were combined in a glass pressure tube under an inert atmosphere of pure nitrogen in a sealed glove box. We then sealed the ampoule, removed it from the glove box, and placed it in an oil bath for three hours at approximately 125 °C. The resulting precipitate was a dark brown solid, [Si(bpy)3]I4. After cooling, the product was rinsed with 2.00 mL of methanol three times to remove any partially substituted product. It was then rinsed with 2.00 mL of chloroform to remove unreacted ligand. Lastly, [Si(bpy)3]I4 was washed with diethyl ether and placed in a vacuum chamber to dry.

In order to convert [Si(bpy)3]I4 to a hexafluorophosphate salt, 0.1 g of [Si(bpy)3]I4 was dissolved in 5.00 mL of deionized water followed by centrifugation and decantation of the solution. Approximately 1.00 mL of saturated NH4PF6 solution was added to the decanted solution. The yellow precipitate, [Si(bpy)3](PF6)4, was dissolved in 5.00 mL of DI water three more times to remove impurities. The off-white precipitate was left to dry in the vacuum overnight.

To perform the spectroelectrochemistry of [Si(bpy)3](PF6)4, a spectroelectrochemical cell from Pine Research Instrumentation, as shown in Figure 2, was used. The cell consisted of a quartz cuvette with a 1.0 mm pathlength, a gold honeycomb working electrode, and a gold counter electrode. It also contained a Ag/AgCl reference electrode. The samples of [Si(bpy)3](PF6)4 (2.8 × 10⁻⁴ M) were prepared in anhydrous acetonitrile with NBu4PF6 (0.1 M). The solutions were studied using a Princeton Applied Research potentiostat/galvanostat and analyzed with an Agilent 8453 Diode Array Spectrometer [6].
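The reagent quantities quoted for the synthesis of [Si(bpy)3]I4 can be checked with a short stoichiometry calculation. The Python sketch below is illustrative only: it uses rounded standard atomic masses, assumes SiI4 is the limiting reagent, and derives a theoretical mass of product that is not reported in the text. The helper names are hypothetical.

```python
# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "Si": 28.085, "I": 126.904}


def molar_mass(formula):
    """Molar mass in g/mol from a dict of element -> atom count."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def mmol(mass_g, formula):
    """Amount of substance in mmol for a given mass in grams."""
    return 1000.0 * mass_g / molar_mass(formula)


SII4 = {"Si": 1, "I": 4}            # silicon tetraiodide
BPY = {"C": 10, "H": 8, "N": 2}     # 2,2'-bipyridine

n_sii4 = mmol(1.00, SII4)
n_bpy = mmol(1.16, BPY)
equivalents = n_bpy / n_sii4

# [Si(bpy)3]I4 contains one SiI4 unit and three bipyridine ligands.
product_molar_mass = molar_mass(SII4) + 3 * molar_mass(BPY)
theoretical_g = n_sii4 / 1000.0 * product_molar_mass   # assumes SiI4 is limiting

print(f"SiI4: {n_sii4:.2f} mmol, bpy: {n_bpy:.2f} mmol ({equivalents:.2f} eq.)")
print(f"Theoretical mass of [Si(bpy)3]I4: {theoretical_g:.2f} g")
```

The calculated amounts agree with the quoted 1.87 mmol of SiI4 and with roughly four equivalents of ligand, one more than the three ligands incorporated into the complex.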

Figure 2. The figure shows where electrons can bind on the [Si(bpy)3]4+ complex. When completely oxidized in a degassed environment, the honeycomb cell turns clear. Applying a voltage of approximately 100 mV is enough to start reducing the [Si(bpy)3]4+ to its reduced "green" state [5].

2.2 Device Development from [Si(bpy)3](PF6)4

The electrochromic device consisted of two panes of conducting fluorine-doped tin oxide (FTO) glass compressed together with an O-ring spacer and filled with an electrolyte solution. It was necessary to immobilize the dye on the anode so that it would not drift to the cathode where it would be oxidized back to the colorless state. To accomplish this, 5.4 mg of [Si(bpy)3](PF6)4 was dissolved in approximately 2.00 mL of water. This solution was then pipetted onto one of the FTO slides, which had been placed on a hot plate at 85 °C. Upon evaporation, a film was left on the anode and an O-ring was placed above it. The O-ring acted as a spacer between the two slides, which were held together by two binder clips. A saturated solution of the electrolyte, tetrabutylammonium hexafluorophosphate (NBu4PF6), filled the enclosure created by the O-ring. In order to change the color of the device, a Princeton Applied Research model 173 potentiostat/galvanostat was connected to the device. The reference and working electrodes were connected to the same glass (anode) while the counter electrode was connected to the cathode, as shown in Figure 3 [5]. Application of a negative potential of 100 mV led to a color change in the device from nearly colorless to red.


Figure 3. This is a first-generation device that shows electrochromic activity in a non-degassed environment. The oxidized state is pictured on the top and the reduced state is pictured on the bottom.

2.3 Synthesis 2

A second complex, bis(4'-tolyl-2,2':6',2''-terpyridyl)silicon(IV) hexafluorophosphate, [Si(tolylterpy)2](PF6)4, was synthesized by following procedures specified by Liu, et al. to first produce the ligand tolylterpyridine (Ttpy) [7]. After the synthesis of the ligand, the procedure by Suthar, et al. was modified to form the iodide salt of the complex. The hexafluorophosphate salt of the complex was made by dissolving the iodide complex in water and adding ammonium hexafluorophosphate to precipitate the desired solid. The complex ion, [Si(ttpy)2]4+, is shown in Figure 4.

Figure 4. This figure is a 2-D representation of the complex ion structure of [Si(ttpy)2]4+ [6].

As specified by Liu, et al., to make Ttpy, 4.84 g (40 mmol) of 2-acetylpyridine and 2.40 g (20 mmol) of 4-methylbenzaldehyde were added to 100 mL of ethanol with a magnetic stir bar [7]. Next, 1.60 g (40 mmol) of NaOH and 65.0 mL of ammonia water (25%) were added to the solution, which was stirred at 35 °C for one day. Once cooled, the white precipitate was collected by filtration, recrystallized three times with ethanol, and then dried in a vacuum.

According to the paper by Suthar et al., 1.31 g (5.60 mmol, 3 molar equivalents) of Ttpy, 1.00 g (1.90 mmol) of SiI4, and 5.90 g of pyridine were mixed together in an ampoule [6]. The ampoule was placed in an oil bath at 125 °C for 3 hours. The precipitate was a dark brown solid, [Si(tolylterpy)2]I4. After cooling the solution, the product was rinsed with methanol to remove partially substituted byproducts. It was then rinsed with chloroform and diethyl ether, in that order, and dried in a vacuum. Dissolving 40 mg of [Si(tolylterpy)2]I4 in 25 mL of water and mixing with NH4PF6 in water precipitated the hexafluorophosphate salt, [Si(tolylterpy)2](PF6)4. The yellow precipitate was centrifuged, then rinsed with water, ethanol, and ether before drying in a vacuum.

3. Results

3.1 Results- Synthesis 1

[Si(bpy)3]I4 as a dark brown solid was typically produced in 60% yield or better, and conversion to the hexafluorophosphate salt, [Si(bpy)3](PF6)4, also proceeded at about 60% yield. Differences in the solubility of the complexes affected this yield depending on the number and nature of the rinsing steps. Tris(bipyridyl)silicon(IV) was reduced with a potentiostat in acetonitrile with tetra-n-butylammonium hexafluorophosphate to obtain the UV-vis spectra of its reduced species. The fully oxidized state (4+) was colorless, while the reversible reduced states (3+, 2+, 1+) were green, as shown in the degassed spectroelectrochemical honeycomb cell in Figure 2.

3.2 Results- Device Development from [Si(bpy)3](PF6)4

In the first trial of device development, the solution of [Si(bpy)3](PF6)4 and water was pipetted directly onto the FTO slide and then the slide was left to dry in air. However, this method was very ineffective because the solution would slip off of the slide. Furthermore, epoxy and double-sided tape were used in an attempt to hold the slides together and create a spacer; however, the epoxy did not dry well and the tape's adhesion was not robust. Therefore, this method of window development was aesthetically unappealing and defective.

The second trial of device development was much more successful. To keep the solution from moving, the slide was placed on a hot plate while the solution was slowly pipetted onto the slide, evaporating off the water. In order to improve the spacer and the untidy appearance of the slide, a thin rubber O-ring for a lid was used to separate the slides and contain the electrolyte. Also, two binder clips were used to hold the slides together. This method was very effective because the window was uncluttered, separated, and easier to view. As low voltage (-100 mV) was applied to the device, the glass turned light yellow when oxidized and a dark brown color when reduced. These partial reduction states are shown in the degassed honeycomb cell in Figure 5. The window was not colorless to begin with because the environment was not degassed. These results show that hexacoordinate polypyridyl silicon complexes with a +4 charge have the capacity to change color with low applied voltage due to their electron-deficient outer shells. They can be used in the development of a functional electrochromic device. Although the device was successful, previous experiments with this dye had shown that formation of a brown color indicates a reaction of the reduced dye with oxygen. Consequently, in subsequent devices, it will be necessary to seal the device in an inert atmosphere to exclude oxygen.


3.3 Results- Synthesis 2

The Ttpy was a white precipitate with a yield of 26%. The [Si(tolylterpy)2]I4 was a dark brown precipitate with a yield of about 80%. The yield of [Si(tolylterpy)2](PF6)4, a yellow solid, was about 60%, due to the solubility issues mentioned above. Spectroelectrochemistry of [Si(tolylterpy)2]4+ was performed on the compound. The honeycomb cell in Figure 6 shows the oxidation and reduction of [Si(tolylterpy)2]4+ along with the accompanying color change.

Figure 5. This figure shows the color of [Si(bpy)3]4+ at different reduction potentials when degassed in a honeycomb cell in the presence of oxygen.

Figure 6. This figure shows the color of [Si(tolylterpy)2]4+ at different reduction potentials when degassed in a honeycomb cell in the presence of oxygen.

4. Discussion

The results of my project demonstrate that polypyridylsilicon(IV) type compounds exhibit electrochromic properties, allowing for the development of devices that could be used in energy-efficient applications. In a degassed environment, [Si(bpy)3]4+ can be completely reduced and oxidized, thus allowing for the switch from colorless to green and back again, as seen in Figure 2. However, the reduced states are susceptible to oxygen and appear to react with it to form a yellowish/brown product (Figure 5). We built a prototype device using two conductive FTO slides with silicon-based dye deposited on the anode. It successfully exhibited an electrochromic response. However, the response clearly indicated the presence of oxygen in the cell. This problem could be prevented by preparing the device in an inert atmosphere, such as a glove box, and sealing it with an airtight sealant. There are several epoxides, for example, that could be tested for this purpose.

The new compound, [Si(tolylterpy)2]4+, synthesized in our research group exhibited electrochromic properties similar to [Si(bpy)3]4+. When oxidized, the compound turned colorless and when reduced, the compound became darker, as shown in the degassed cell in Figure 6. However, since this is in the presence of oxygen, further experimentation is needed to confirm these colors.

5. Conclusions and Future Work

In this research, two hexacoordinate silicon complexes were synthesized, one from previous literature procedures and one novel complex. In addition, the spectroelectrochemical properties of each compound were further investigated, allowing for the development of the first functional prototype of an electrochromic device based on polypyridylsilicon(IV) compounds. In this research, we also learned that when [Si(bpy)3]4+ is in a reduced state it reacts with oxygen to generate an unidentified yellow/brown species. Therefore, when developing an electrochromic device it is necessary to carefully exclude all oxygen. In the near future, we plan to build a cell inside a glove box filled with an inert atmosphere of nitrogen or argon. Then, we will seal the device with an epoxide coating to prevent oxygen from getting in. This second-generation prototype device should exhibit the reversible colorless to green transition observed in the degassed spectroelectrochemical cell (Figure 2).

Although improvements are needed, we successfully built and tested the first functional electrochromic device from a hexacoordinate polypyridyl silicon complex. These results could be used to develop large-scale electrochromic windows for vehicles and buildings. In addition, hexacoordinate polypyridyl silicon complexes could potentially be used as dyes for printing on electrochromic glass. Future work with the chemistry and device could lead to cheap, sustainable electrochromic windows with a wide range of color control and design. Some day you might be looking at the world through rose-colored glasses made possible by an electrochromic polypyridylsilicon(IV) device.


6. Acknowledgements

I would like to thank Derek M. Peloquin, Domelia R. Dewitt, Dr. Jon W. Merkert, Dr. Bernadette T. Donovan-Merkert, and Dr. Thomas A. Schmedake for their contributions. I would also like to thank the University of North Carolina-Charlotte for providing the lab space and materials for this project.

7. References

[1] Department of Energy. (n.d.). Retrieved September 20, 2015, from
[2] R. Mortimer and T. Varley, Novel Color-Reinforcing Electrochromic Device Based on Surface-Confined Ruthenium Purple and Solution-Phase Methyl Viologen, 2011, 4077-4080.
[3] Davies, E. (2011, January). Critical Thinking. Chemistry World. Retrieved September 2, from http://www.
[4] B. Suthar, A. Aldongarov, I. S. Irgibaeva, M. Moazzen, B. T. Donovan-Merkert, J. W. Merkert and T. A. Schmedake, Polyhedron, 2012, 31, 754-758.
[5] H. Goto, H. Yoneyama, F. Togashi, R. Ohta, A. Tsujimoto, E. Kita, and K. Ohshima, Preparation of Conducting Polymers by Electrochemical Methods and Demonstration of a Polymer Battery, 2008, 1067-1070.
[6] D. Peloquin, D. Dewitt, S. Patel, J. Merkert, B. Donovan-Merkert, and T. Schmedake, Spectroelectrochemistry of tris(bipyridyl)silicon(IV): ligand localized reductions with potential electrochromic applications, 2015, 2-4.
[7] X. Liu, J. Xu, Y. Lv, W. Wu, W. Liu and Y. Tang, An ATP-selective, lanthanide complex luminescent probe, Dalton Trans., 2013, 42, 9840.
[8] C. M. Elliot and J. G. Redepenning, Stability and Response Studies of Multicolor Electrochromic Polymers Modified Electrodes Prepared From Tris(5,5'-Dicarboxyester-2,2'-Bipyridine)Ruthenium(II), J. Electroanal. Chem., 1986, 219-232.


Creation of Plasmodium falciparum Hsp90 Selective Inhibitors for Antimalarial Drug Development

Vibha Puri


Malaria remains one of the largest public health challenges to this day, with an estimated 3.3 billion people at risk of infection and 584,000 deaths in 2014 alone. The Plasmodium falciparum parasite causes the most dangerous form of malaria and has developed resistance to almost all current antimalarial drugs. This study aims to provide a solid basis for developing new drugs to treat malaria. We did this by targeting the chaperone protein P. falciparum heat shock protein 90 (PfHsp90), which is responsible for the proper folding of multiple integral proteins in the organism and is essential to the erythrocytic life cycle of the parasite; inhibition of PfHsp90 effectively impedes parasite development. Geldanamycin is a naturally occurring, potent inhibitor of Hsp90, but is not selective enough for PfHsp90 over human host Hsp90 (HsHsp90), impeding its potential as a therapeutic. We designed and created 45 structural analogs of geldanamycin to increase selectivity. Based on computational testing, a number of promising candidates were identified for drug testing; 19 of the created inhibitors were predicted to inhibit PfHsp90 with greater selectivity, enabling possible future therapeutic use. The quinone moiety of geldanamycin was also found to be integral for selective binding to PfHsp90. In conclusion, we have identified a mechanism of selectivity that can help in the creation of other potential inhibitors, and we have proposed selective, effective inhibitors for use in the synthesis of new drugs to combat this dangerous disease.

1. Introduction

1.1 Malaria and Plasmodium falciparum

Malaria poses a large public health challenge worldwide, with an estimated 3.3 billion people at risk of infection as of 2014 [1]. Of these, 1.2 billion are at high risk, having a greater than 1 in 1000 chance of contracting malaria in the next year. This is an increase from 2013, when 198 million cases of malaria occurred globally and caused 584,000 deaths. The region most affected is Africa, where an estimated 90% of all malarial deaths occur, mainly in children under 5 years old, who account for 78% of all deaths [1]. Malaria is caused by parasites of the genus Plasmodium, which includes six species known to cause disease in humans [2]. Of these, Plasmodium falciparum is the most dangerous, with the highest rate of mortality [3]. In this study, we aim to provide a solid basis for the synthesis of a new antimalarial drug against Plasmodium falciparum malaria.

The infection is spread through a female mosquito vector. Malaria consists of two stages: liver-stage and blood-stage infection. The liver stage is an asymptomatic prerequisite for the blood stage: the parasite first infects the liver as sporozoites and multiplies within the liver cells. Next, merozoites (daughter parasites) are released from the liver into the bloodstream, infecting erythrocytes (red blood cells) in a cyclical stage in which symptoms develop [4]. Since malaria's symptoms only present themselves in the blood stage, many of the current treatments target the blood stage of malaria [5]. However, targeting the asymptomatic liver stage offers a prophylactic therapeutic advantage; the target we exploit with our inhibitors provides the potential for dual-stage inhibition.

1.2 Current Treatments

Chemoprevention, including intermittent preventive treatment in pregnancy (IPTp) and intermittent preventive treatment for infants (IPTi), has been shown to be effective in pregnant women and young children. It reduces the likelihood of perinatal mortality, maternal anemia, and low birth weight in pregnant women, and it protects infants against clinical malaria and anemia within the first year of life [1]. While this is effective as preventative treatment, there is still an unmet need in therapeutics for individuals already infected. Artemisinin-based combination therapy (ACT) is the current standard for treating malaria. However, P. falciparum resistance to artemisinin has been detected in five countries [1]. In addition, P. falciparum has developed resistance to nearly all common antimalarial drugs, such as chloroquine, mefloquine, and halofantrine, further hampering efforts to develop effective treatments [6]. The parasite exhibits multi-drug resistance, creating a need for a new drug with a target unexploited by popular antimalarials.

1.3 Heat Shock Protein 90 (Hsp90)

Our proposed therapeutic target was heat shock protein 90, or Hsp90. This chaperone protein is highly conserved across eukaryotic species ranging in complexity from simple yeasts to humans [7]. In malaria, Hsp90 is expressed by both the P. falciparum parasite (PfHsp90) and the human host (HsHsp90) [8].


Hsp90 protects stressed cells by properly folding client proteins involved in cell survival. During the erythrocytic Plasmodium life cycle, in which the parasite infects human blood cells, cells are particularly stressed, inducing overexpression of PfHsp90. Furthermore, in all cells, stressed or unstressed, Hsp90 is a chaperone protein responsible for the folding of many client proteins integral to the organism, including transcription factors and protein kinases. When Hsp90 is inhibited, these client proteins are unable to fold into their native conformations and are degraded by proteases, leaving the organism unable to perform integral functions [8]. Inhibiting PfHsp90 showed promise in a study by Banumathy et al., which found that treatment with geldanamycin impeded parasite growth [9]. While Hsp90 is a known blood-stage parasite target, it is likely critical to liver-stage development as well, allowing Hsp90 inhibitors to potentially function as dual-stage malarial inhibitors [10]. Hsp90 inhibitors also have the potential to be used in combination therapies with current antimalarials to circumvent resistance [8]. Hsp90 consists of three domains: the N-terminal ATP-binding domain, a middle domain for ATP turnover, and the C-terminal dimerization domain [7]. The component of Hsp90 most interesting to us is the N-terminal ATP-binding domain (Figure 1).

Figure 1: ADP bound to the N-terminal domain of Hsp90 [7]

The majority of common Hsp90 inhibitors bind this site, effectively preventing ATP from binding. Because ATP binding is necessary for the conformational change of Hsp90 that enables its chaperoning functions, this prevents the protein from carrying out its essential role [7]. HsHsp90 and PfHsp90 are highly conserved, with 69% sequence similarity between the two [9]. However, there are three major residue differences between the two within the ATP binding site: Val186 of HsHsp90 is replaced by Ile173 in PfHsp90, Ser52 by Ala38, and Lys112 by Arg98 [11]. In this study, we aimed to design inhibitors that are selective for PfHsp90 by exploiting these differences.

1.4 Geldanamycin

Geldanamycin, a natural product from Streptomyces hygroscopicus, is a well-known, potent Hsp90 inhibitor [12]. However, it is currently only mildly selective for PfHsp90 over HsHsp90 [19]. Many side effects can occur when a drug binds to proteins that share a similar binding site; hence, drugs should have a very high affinity for the specific Plasmodium target, allowing a low dosage to selectively achieve the goal of target-only binding [18]. Furthermore, geldanamycin exhibits high hepatotoxicity [13]. A suitable drug would have to exhibit higher selectivity for P. falciparum Hsp90 so that it is effective at a dosage low enough to minimize toxic effects. One method proposed to reduce geldanamycin toxicity is to remove the quinone moiety, as its removal from macbecin, another Hsp90 inhibitor, was shown to significantly decrease toxicity [13]. This is because the quinone moiety undergoes a redox reaction which generates toxic radicals within the body [14].

1.5 Structural Analogs

Several structural analogs of geldanamycin have been designed to improve its therapeutic potential and are currently in clinical trials. One example is 17-N-allylamino-17-demethoxygeldanamycin, or 17-AAG, which differs from geldanamycin (GA) only at the C-17 position (Figure 2).

Figure 2: The chemical structure of geldanamycin (left) and its analog 17-AAG (right) [8]

The substitution of the methoxy group with an amine greatly improves geldanamycin's toxicological profile, as 17-AAG exhibits lower hepatotoxicity [13]. Similarly, the analog 17-dimethylaminoethylamino-17-demethoxygeldanamycin, or 17-DMAG, was created as a water-soluble derivative of 17-AAG [15]. Eventually, once we design a more selective inhibitor, we can work on improving its solubility and other properties with further structural modifications. To create viable structural analogs, the chemical properties must be considered. The potential interactions the analogs will have within the Hsp90 ATP binding pocket are important in designing reasonable analogs that exploit structural differences and thereby achieve selectivity. Furthermore, positions that can be modified via known biosynthetic pathway mutations should be preferentially considered; only C-7, C-15, C-17, and C-19, known alterable positions on geldanamycin, were modified in the inhibitors proposed in this study [12][13][15][20][21].

2. Materials and Methods

2.1 Defining the Binding Site

Computational models of Homo sapiens and Plasmodium falciparum Hsp90 were generated using known crystallized structures of the receptors bound to geldanamycin (PDB: 1YET) and ADP (PDB: 3K60), respectively. The receptors were processed in AutoDock Tools to remove water molecules and any attached ligands and to add hydrogens. Key residues at the N-terminal ATP-binding site, identified by Corbett and Berger, were treated as flexible to make the computational model more realistic and reflective of behavior in vivo [7]. HsHsp90 and PfHsp90 were prepared as receptors for docking in AutoDock Tools by setting grid box parameters corresponding to the binding pocket for the known, natural ligands within the N-terminal domain (Figure 3). The parameters were validated by overlaying our models with the known crystal structures of natural ligands docked within each receptor.

Figure 3: The grid box around the binding site of PfHsp90

Each structural analog created was docked into HsHsp90 and PfHsp90 using AutoDock Vina with the grid box parameters to predict binding affinities and conformations [16].

2.2 Creating, Processing, and Docking Structural Analogs

The proposed structural analogs of geldanamycin were designed in ChemDraw Professional. These files were then converted into pdbqt files, 3D models of the chemical structures. The 3D models were processed using AutoDock Tools; Gasteiger charges were added according to the software defaults, and torsions were enforced upon the ligand to be more representative of binding in vitro. Each of these molecules was then docked into each receptor using AutoDock Vina. A batch program was created to dock each ligand in quick succession, maximizing efficiency and increasing throughput without the decrease in quality that would occur by running the dockings in parallel.

2.3 Evaluating Analogs

AutoDock Vina calculates the binding affinities of the most likely conformations of a ligand docked in a receptor and outputs a file visualizing the docked ligand. The output files from AutoDock Vina were analyzed in PyMol. The molecular interactions between the ligand and receptor were visualized to interpret the selectivity of each analog by identifying polar contacts between the ligand and receptor. Based on a set of criteria, promising inhibitors were designated as primary or secondary candidates.
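The batch docking and affinity extraction described above can be illustrated with a minimal Python sketch. It is not the batch program used in this study: the file names, the helper names (`build_vina_command`, `best_affinity`), and the use of a configuration file holding the grid box center and size are assumptions made for illustration. The command-line options shown (`--receptor`, `--flex`, `--ligand`, `--config`, `--out`) and the `REMARK VINA RESULT:` lines in the output file are standard AutoDock Vina features.

```python
from pathlib import Path


def build_vina_command(receptor, ligand, config, out_dir, flex=None):
    """Build the argument list for one AutoDock Vina run (nothing is executed here).

    All file names are hypothetical; `config` is assumed to be a Vina
    configuration file that stores the grid box center and size.
    """
    out_name = f"{Path(ligand).stem}_{Path(receptor).stem}_out.pdbqt"
    cmd = [
        "vina",
        "--receptor", str(receptor),
        "--ligand", str(ligand),
        "--config", str(config),
        "--out", str(Path(out_dir) / out_name),
    ]
    if flex is not None:
        # Flexible binding-site residues are supplied as a separate pdbqt file.
        cmd += ["--flex", str(flex)]
    return cmd


def batch_commands(receptor_configs, ligands, out_dir):
    """Return one command per (receptor, ligand) pair, in a fixed sequential order."""
    return [
        build_vina_command(receptor, ligand, config, out_dir)
        for receptor, config in receptor_configs
        for ligand in ligands
    ]


def best_affinity(pdbqt_text):
    """Return the strongest (most negative) predicted affinity in kcal/mol."""
    scores = [
        float(line.split()[3])
        for line in pdbqt_text.splitlines()
        if line.startswith("REMARK VINA RESULT:")
    ]
    if not scores:
        raise ValueError("no Vina result lines found")
    return min(scores)
```

Each command could be passed to `subprocess.run(cmd, check=True)` one at a time, which docks the ligands in succession rather than in parallel, in keeping with the approach described in Section 2.2.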

3. Results

The geldanamycin analogs created demonstrated a wide range of results in terms of their interactions with the parasite protein and the human host protein. Each analog was visualized using PyMol, and its polar interactions were identified. For example, C17-1 was shown to interact only with the residue Lys58 in HsHsp90 while interacting with Asp40, Asn37, Thr101, Arg98, and Lys44 in PfHsp90, indicative of higher selectivity for PfHsp90 (Figure 4).

Figure 4: C17-1 interacting with choice residues in the Homo sapiens (left) and Plasmodium falciparum Hsp90 receptor (right). The ligand is shown as multicolored sticks while the receptor is shown in blue; the residues the ligand interacts with are highlighted in purple, and interactions between the ligand and receptor are shown as dotted lines.

Modifications were made that variously increased or decreased selectivity. Alterations included adding hydroxyl groups with or without methyl linkers, removing the quinone moiety, and adding halogens, carboxylic acids, amine groups, and amide groups. The binding affinities of each of the inhibitors for each receptor and the residues the ligands were predicted to interact with were compiled and compared for analysis. Six primary candidates (Table 1) and 13 secondary candidates (Appendix 1) were identified from the 45 molecules created.

The six primary candidates were C17-1, C17-6, C17-10, C17-11, C17-13, and C7-6. All of these analogs interacted with malarial-specific residues, exhibited greater binding affinities for PfHsp90 than for HsHsp90, and had stronger networks of polar connections with PfHsp90 than with HsHsp90, all of which suggest increased selectivity in vitro. The 13 secondary candidates proposed (Appendix 1) similarly interacted with malarial-specific residues. They also had a stronger network of interactions with PfHsp90 in comparison to HsHsp90, though less so than the primary candidates. Though their predicted binding affinities for PfHsp90 were not greater than those for HsHsp90, these analogs exhibited increased binding affinity for PfHsp90 and decreased binding affinity for HsHsp90 relative to wild-type geldanamycin.
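The distinction between primary and secondary candidates described above can be written out as a short Python sketch. This is an illustrative reading of the stated criteria, not the procedure actually used: the class and function names are hypothetical, the strength of the interaction network is approximated by the number of contacted residues, and the affinities in the usage note are invented placeholders rather than results from this study. The malarial-specific residues are those listed in Section 1.3.

```python
from dataclasses import dataclass

# PfHsp90 residues that differ from HsHsp90 within the ATP binding site (Section 1.3).
MALARIAL_SPECIFIC = frozenset({"Ile173", "Ala38", "Arg98"})


@dataclass
class DockingResult:
    name: str
    pf_affinity: float      # kcal/mol for PfHsp90; more negative means stronger binding
    hs_affinity: float      # kcal/mol for HsHsp90
    pf_contacts: frozenset  # residues with predicted polar contacts in PfHsp90
    hs_contacts: frozenset  # residues with predicted polar contacts in HsHsp90


def classify(analog, reference):
    """Label an analog 'primary', 'secondary', or 'rejected' relative to a reference ligand."""
    hits_specific = bool(analog.pf_contacts & MALARIAL_SPECIFIC)
    more_pf_contacts = len(analog.pf_contacts) > len(analog.hs_contacts)
    if not (hits_specific and more_pf_contacts):
        return "rejected"
    # Primary: predicted to bind the parasite protein more strongly than the host protein.
    if analog.pf_affinity < analog.hs_affinity:
        return "primary"
    # Secondary: stronger for PfHsp90 and weaker for HsHsp90 than the reference ligand.
    stronger_for_pf = analog.pf_affinity < reference.pf_affinity
    weaker_for_hs = analog.hs_affinity > reference.hs_affinity
    return "secondary" if stronger_for_pf and weaker_for_hs else "rejected"
```

For instance, with a placeholder reference of -7.0 kcal/mol (PfHsp90) and -8.0 kcal/mol (HsHsp90), an analog scoring -9.0 and -7.5 that contacts Arg98 would be labeled primary, while one scoring -7.5 and -7.8 would be labeled secondary.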

Figure 5: The chemical structure of geldanamycin, on which the analogs created were based

Table 1: The primary candidates’ chemical structures, binding affinities for PfHsp90 and HsHsp90, and residues of interaction with each. The malarial-specific residues are written in red text. A number in parentheses indicates multiple predicted polar connections to that residue. All of these molecules exhibited greater selectivity for PfHsp90 than wild-type geldanamycin based on three criteria: (1) their binding affinities for the PfHsp90 receptor are greater than those for HsHsp90, especially in comparison to wild-type geldanamycin in the first row of the table; (2) they interact with malarial-specific residues; (3) each molecule interacts with a greater number of residues in PfHsp90 than in HsHsp90.

4. Discussion

The criteria for the selection of promising analogs were as follows. Firstly, the analog should have predicted interactions with malarial-specific residues. Secondly, there should be an increase in binding affinity for PfHsp90 and a decrease in affinity for HsHsp90 relative to wild-type geldanamycin. The number of interactions between the ligand and receptor was also taken into account; analogs with a higher number of predicted interactions were considered better than those with fewer polar contacts between the ligand and receptor. Compared to the original structure of geldanamycin (Figure 5), the six most promising analogs created were C7-6, C17-1, C17-6, C17-10, C17-11, and C17-13 (Figure 6).

Figure 6: The chemical structures of the six primary candidates for further exploration as selective Plasmodium falciparum Hsp90 inhibitors

All of these molecules interacted with the malarial-specific residue Arg98, thereby exploiting one of the few residue differences between HsHsp90 and PfHsp90 within the binding pocket. The binding affinities of all these analogs not only increased for PfHsp90 and decreased for HsHsp90 in relation to wild-type geldanamycin, but were also greater for PfHsp90 than for HsHsp90, indicating high selectivity. The number of interactions each of these molecules had with PfHsp90 was also significantly greater than that with HsHsp90. C7-6’s added hydroxyl and methyl linker pushed the molecule into a conformation that fit tightly within PfHsp90’s binding pocket (Figure 7). C17-1’s five predicted polar interactions with PfHsp90, in comparison to one with HsHsp90, are especially promising, as is its 31% increase in affinity for PfHsp90 coupled with an 8% decrease in affinity for HsHsp90 relative to wild-type geldanamycin. C17-11 produced similar results, with seven interactions with PfHsp90 and two with HsHsp90, an 11% decrease in affinity for HsHsp90, and a 34% increase in affinity for PfHsp90. Neither the added hydroxyl group in C17-1 nor the added amine group in C17-11 was predicted to interact with the receptor directly; instead, each allowed the ligand to adopt a conformation within the receptor that optimized other bonds.
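The percentage changes quoted above compare the magnitude of an analog's predicted binding affinity with that of wild-type geldanamycin. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and the numbers in the usage note are placeholders, not affinities from Table 1. Docking affinities are reported as negative values in kcal/mol, so the comparison is made on absolute values.

```python
def percent_change_in_affinity(analog_kcal, reference_kcal):
    """Percent change in affinity magnitude relative to a reference ligand.

    A positive result means the analog is predicted to bind more strongly
    than the reference; a negative result means it binds more weakly.
    """
    reference_magnitude = abs(reference_kcal)
    return 100.0 * (abs(analog_kcal) - reference_magnitude) / reference_magnitude
```

With placeholder values, an analog at -10.0 kcal/mol against a reference at -8.0 kcal/mol gives a 25% increase, while an analog at -9.0 kcal/mol against a reference at -10.0 kcal/mol gives a 10% decrease.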

Figure 7: (Left) Wild-type geldanamycin docked in the binding pocket of the PfHsp90 receptor. (Right) Created analog C7-6 docked in the binding pocket of the PfHsp90 receptor.

The substitution of a fluorine for a methyl group in C17-6 was hypothesized to greatly change the behavior of the molecule because of fluorine’s inherent electronegativity. It seems to have done so in a beneficial way, pushing the molecule into conformations with greater opportunity for stronger bonds and interactions with Arg98. The introduction of fluorine also allowed for more interactions within PfHsp90 in comparison to HsHsp90, with five polar contacts to the parasite protein versus two to the human host protein. Fluorine is also often added to enhance solubility, improving the therapeutic potential of geldanamycin [17]. C17-10 featured the replacement of a methoxy group with a carboxylic acid. This provided not only an additional opportunity for hydrogen bonding but also a longer side chain to extend into the binding pocket, as the new carboxylic acid interacted with the Lys44 residue in PfHsp90. C17-13 functioned similarly, interacting with Lys44 through its hydroxyl attached to a methyl linker. Aside from these six primary candidates, thirteen secondary candidates were also chosen for further exploration based on the criteria set forth (Figure 8).

Figure 8: The secondary candidates for further exploration as selective Plasmodium falciparum Hsp90 inhibitors

These analogs also interacted with the malarial-specific residue Arg98. Though they did not have a greater binding affinity for PfHsp90 than for HsHsp90, their binding affinities increased for PfHsp90 and decreased for HsHsp90 in relation to wild-type geldanamycin. These analogs also featured a stronger network of interactions within the PfHsp90 binding pocket than within HsHsp90’s, though not to the same extent as the primary candidates. Overall, these analogs show promise, as their predicted selectivity for the parasite is greater than wild-type geldanamycin’s, and they should be explored further because slightly different behavior may be observed in vitro than in silico. Furthermore, possible improved therapeutic properties of these molecules resulting from the modifications made may far outweigh the slight decrease in selectivity in comparison to the primary candidates.

An analog of each of these most promising candidates lacking the quinone moiety was also tested in molecular docking simulations. The quinone moiety is thought to be a major source of toxicity for geldanamycin, since its removal has been shown to decrease toxicity [13]. Interestingly, in 22 of the 24 total analogs that interacted with Arg98, the malarial-specific residue, an oxygen within the quinone moiety was the atom on geldanamycin predicted to bind to this residue. Furthermore, the majority of non-quinone counterparts of these inhibitors showed increased binding affinity and number of polar contacts for HsHsp90, significantly decreasing the selectivity of these proposed analogs. Thus, while non-quinone compounds were potent Hsp90 inhibitors in other studies, removing the quinone moiety does not appear to be a feasible route to creating Plasmodium falciparum selective inhibitors.

These inhibitors, in targeting Hsp90, are versatile. Though the Hsp90 inhibitor 17-AAG was created to more effectively target Hsp90 in tumors, it has also been shown to have effects against Alzheimer’s [22]. Thus, it is quite possible that molecules created as inhibitors of Plasmodium falciparum Hsp90 could have implications for the treatment of cancer and other diseases. Furthermore, since Hsp90 is present at all stages of the malarial life cycle, these inhibitors could be used to treat not only the blood stage of malaria, like current antimalarial drugs, but also the liver stage, functioning as prophylactics. These inhibitors also have the potential to be used synergistically with current antimalarials to circumvent resistance.

5. Conclusions and Future Work

Structural modifications to geldanamycin resulted in 19 selective inhibitors of Plasmodium falciparum Hsp90. These inhibitors interacted with malarial-specific residues, had a significantly stronger network of interactions with PfHsp90 than with HsHsp90, and had higher predicted binding affinities for PfHsp90 and lower affinities for HsHsp90 in comparison to wild-type geldanamycin in silico. Though the quinone moiety is often targeted to decrease toxicity, we found that it significantly contributed to selectivity for PfHsp90, leading us to look for alternative methods to improve the therapeutic properties of the inhibitors. The selective inhibitors proposed can provide the basis of new drugs that target PfHsp90 for antimalarial effects. Targeting Hsp90 with these inhibitors could provide a large range of beneficial effects, since Hsp90 is critical to the development of the parasite within the human host. Drugs based on these structures could also be used synergistically with current antimalarials to circumvent resistance. Our work provides a basis for scientists to synthesize drugs that should then be further tested for human use. In future work we plan to explore the proposed analogs in vitro. The compounds proposed will be synthesized and tested in competitive binding assays to measure their binding affinity for PfHsp90 and HsHsp90; solubility, toxicity, and other therapeutic properties of these molecules will be explored as well.

6. Appendix 1

Appendix 1: The secondary candidates’ chemical structures, binding affinities for PfHsp90 and HsHsp90, and residues of interaction with each. The malarial-specific residues are written in red text. A number in parentheses indicates multiple predicted polar connections to that residue. All of these molecules exhibited greater selectivity for PfHsp90 than wild-type geldanamycin based on three criteria: (1) their binding affinities for the HsHsp90 receptor decrease in absolute value and their binding affinities for the PfHsp90 receptor increase in absolute value in comparison to wild-type geldanamycin in the first row of the table; (2) they interact with malarial-specific residues; (3) each molecule interacts with a greater number of residues in PfHsp90 than in HsHsp90.

7. Acknowledgment

I would like to thank Dr. Emily Derbyshire, Principal Investigator at Duke, and Allison Keim, Graduate Student at Duke, for their discussion and verification of the research and edits to the manuscript of this report. I would also like to thank Dr. Myra Halpin, Chemistry Instructor at the North Carolina School of Science and Mathematics, for providing edits to the manuscript of the report.

8. References

[1] World Health Organization. (2014). World Malaria Report 2014. Retrieved from malaria/publications/world_malaria_report_2014/wmr2014-noprofiles.pdf?ua=1
[2] Subudhi, A. K., Boopathi, P. A., Pandey, I., Kaur, R., Middha, S., Acharya, J., … Das, A. (2015). Disease specific modules and hub genes for intervention strategies: A coexpression network based approach for Plasmodium falciparum clinical isolates. Infection, Genetics and Evolution: Journal of Molecular Epidemiology and Evolutionary Genetics in Infectious Diseases, 35, 96–108.
[3] Perlmann, P., & Troye-Blomberg, M. (2000). Malaria blood-stage infection and its control by the immune system. Folia Biologica, 46(6), 210–8. Retrieved from
[4] Centers for Disease Control and Prevention. (2012). Malaria: Biology. Retrieved September 6, 2015, from
[5] Delves, M., Plouffe, D., Scheurer, C., Meister, S., Wittlin, S., Winzeler, E. A., … Leroy, D. (2012). The Activities of Current Antimalarial Drugs on the Life Cycle Stages of Plasmodium: a Comparative Study with Human and Rodent Parasites. PLoS Medicine, 9(2), e1001169. http://
[6] Centers for Disease Control and Prevention. (2012). Drug Resistance in the Malaria-Endemic World. Retrieved from
[7] Corbett, K. D., & Berger, J. M. (2010). Structure of the ATP-binding domain of Plasmodium falciparum Hsp90. Proteins, 78(13), 2738–44. prot.22799
[8] Shahinas, D., Folefoc, A., & Pillai, D. R. (2013). Targeting Plasmodium falciparum Hsp90: Towards Reversing Antimalarial Resistance. Pathogens (Basel, Switzerland), 2(1), 33–54.
[9] Banumathy, G., Singh, V., Pavithra, S. R., & Tatu, U. (2003). Heat shock protein 90 function is essential for Plasmodium falciparum growth in human erythrocytes. The Journal of Biological Chemistry, 278(20), 18336–45.
[10] Derbyshire, E. R., Prudêncio, M., Mota, M. M., & Clardy, J. (2012). Liver-stage malaria parasites vulnerable to diverse chemical scaffolds. Proceedings of the National Academy of Sciences of the United States of America, 109(22), 8511–6.
[11] Wang, T., Bisson, W. H., Mäser, P., Scapozza, L., & Picard, D. (2014). Differences in conformational dynamics between Plasmodium falciparum and human Hsp90 orthologues enable the structure-based discovery of pathogen-selective inhibitors. Journal of Medicinal Chemistry, 57(6), 2524–35. jm401801t
[12] Shin, J.-C., Na, Z., Lee, D.-H., Kim, W.-C., Lee, K., Shen, Y.-M., … Lee, J.-J. (2008). Characterization of Tailoring Genes Involved in the Modification of Geldanamycin Polyketide in Streptomyces hygroscopicus JCM4427. Journal of Microbiology and Biotechnology, 18(6), 1101–8. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/18600054
[13] Kim, W., Lee, D., Hong, S. S., Na, Z., Shin, J. C., Roh, S. H., … Hong, Y.-S. (2009). Rational biosynthetic engineering for optimization of geldanamycin analogues. Chembiochem: A European Journal of Chemical Biology, 10(7), 1243–51.
[14] Shadle, S. E., Bammel, B. P., Cusack, B. J., Knighton, R. A., Olson, S. J., Mushlin, P. S., & Olson, R. D. (2000). Daunorubicin cardiotoxicity. Biochemical Pharmacology, 60(10), 1435–1444.
[15] Smith, V., Sausville, E. A., Camalier, R. F., Fiebig, H.-H., & Burger, A. M. (2005). Comparison of 17-dimethylaminoethylamino-17-demethoxy-geldanamycin (17DMAG) and 17-allylamino-17-demethoxygeldanamycin (17AAG) in vitro: effects on Hsp90 and client proteins in melanoma models. Cancer Chemotherapy and Pharmacology, 56(2), 126–37. s00280-004-0947-2
[16] Trott, O., & Olson, A. J. (2010). AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading. Journal of Computational Chemistry, 31, 455–461.
[17] Hsiao, S.-H., & Lin, J.-Y. (2015). Synthesis and electrochromic properties of novel aromatic fluorinated poly(ether-imide)s bearing anthraquinone units. Journal of Fluorine Chemistry, 178, 115–130. http://doi.org/10.1016/j.jfluchem.2015.07.012
[18] Nussinov, R., & Tsai, C.-J. (n.d.). The Different Ways through Which Specificity Works in Orthosteric and Allosteric Drugs. Retrieved October 27, 2015, from http://
[19] Pallavi, R., Roy, N., Nageshan, R. K., Talukdar, P., Pavithra, S. R., Reddy, R., … Tatu, U. (2010). Heat shock protein 90 as a drug target against protozoan infections: biochemical characterization of HSP90 from Plasmodium falciparum and Trypanosoma evansi and evaluation of its inhibitor as a candidate drug. The Journal of Biological Chemistry, 285(49), 37964–75.
[20] Rastelli, G., Tian, Z.-Q., Wang, Z., Myles, D., & Liu, Y. (2005). Structure-based design of 7-carbamate analogs of geldanamycin. Bioorganic & Medicinal Chemistry Letters, 15(22), 5016–21. bmcl.2005.08.013
[21] Kitson, R. R. A., Chang, C.-H., Xiong, R., Williams, H. E. L., Davis, A. L., Lewis, W., … Moody, C. J. (2013). Synthesis of 19-substituted geldanamycins with altered conformations and their binding to heat shock protein Hsp90. Nature Chemistry, 5(4), 307–314. http://doi.org/10.1038/nchem.1596
[22] Ho, S. W., Tsui, Y. T. C., Wong, T. T., Cheung, S. K. K., Goggins, W. B., Yi, L. M., … Baum, L. (2013). Effects of 17-allylamino-17-demethoxygeldanamycin (17-AAG) in transgenic mouse models of frontotemporal lobar degeneration and Alzheimer's disease. Translational Neurodegeneration, 2(1), 24.


Biology Research


Computational Modeling of Alcoholic Fermentation in a Population of Saccharomyces cerevisiae in a Closed Environment Jack R. McCluskey

ABSTRACT

The use of Saccharomyces cerevisiae in processes such as viticulture and brewing is widespread throughout human history. This particular species of yeast is used to ferment carbohydrates such as glucose and fructose into ethanol via anaerobic alcoholic fermentation. Here a computational model of alcoholic fermentation in a population of Saccharomyces cerevisiae in a closed environment is created. Using differential equations in the program VenSim, a series of models was generated simulating the fermentation of glucose into ethanol and carbon dioxide by a dynamic population of Saccharomyces cerevisiae. These models displayed a close relationship between population size, ethanol creation, and decreasing environmental carrying capacity. Increasing the accuracy of the variables used and expanding the number of carbohydrates utilized by the model could increase its accuracy as well as create applications in the brewing and viticulture industries.

1. Introduction

Saccharomyces cerevisiae is a unicellular yeast whose primary application for human beings is the creation of ethanol for consumption. While many organisms prefer to undergo aerobic respiration due to its larger energy output per mole of glucose consumed, many forms of yeast produce energy almost exclusively through anaerobic respiration. While anaerobic respiration results in the formation of lactic acid in many organisms, S. cerevisiae undergoes alcoholic fermentation to form ethanol. The formation of ethanol from glucose is represented by the equation below:

$$\mathrm{C_6H_{12}O_6 \rightarrow 2\,C_2H_5OH + 2\,CO_2} \tag{1}$$


This reaction has been utilized in the processes of baking, brewing, and wine-making by humans throughout history, becoming the cornerstone of multiple industries around the globe. Building an understanding of the numerous factors, both genetic and environmental, which influence the processes of population growth and alcoholic respiration can enable these industries to produce higher-quality products with lower research and development costs. How can alcoholic fermentation in a population of S. cerevisiae be computationally modeled?

2. Approach

A model for alcoholic fermentation in a population of S. cerevisiae in a closed environment was created in the systems dynamics program VenSim [7]. Models were created via an iterative process, with each new iteration increasing accuracy by expanding on the scope of the previous model. For simplicity, it is assumed that the only carbohydrate available for fermentation is glucose, although other carbohydrates such as fructose are known to be usable by S. cerevisiae for alcoholic fermentation. Some variables were set to arbitrarily assigned numbers in order to test the functionality of preliminary models. All models also assumed that the population was placed in a closed environment, to which nothing could be added and from which nothing could be removed.

3. Methods

The first model created for this research was based upon a basic alcoholic fermentation equation (1) and relied on many assumptions; this model was meant to lay the groundwork upon which a more complex model would be constructed. As shown in Figure 1, the model only takes into account initial values for glucose, carbon dioxide, and ethanol. This model does not attempt to vary the rate at which fermentation occurs, instead processing one mole (180.156 grams) of glucose during a single run. The model was programmed to run for fifteen minutes at a time step of 0.0625. These values are unrealistic for this process, as more time would be necessary to process the given amount of glucose in a real-world situation.
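The behavior of the Mark I model described in the Methods can be approximated outside VenSim with a short Python sketch. This is an assumption-laden illustration rather than the author's VenSim model: it uses a simple forward-Euler loop, a constant fermentation rate chosen so that one mole of glucose is consumed over the fifteen-minute run, and hypothetical names throughout. The molar mass of glucose is the value quoted in the text; those of ethanol and carbon dioxide are standard values.

```python
# Molar masses in g/mol. The glucose value is quoted in the text;
# the ethanol and carbon dioxide values are standard reference values.
M_GLUCOSE = 180.156
M_ETHANOL = 46.069
M_CO2 = 44.009


def run_mark_one(glucose_mol=1.0, duration_min=15.0, dt=0.0625):
    """Ferment glucose at a constant rate: C6H12O6 -> 2 C2H5OH + 2 CO2.

    Returns the final (glucose, ethanol, co2) amounts in moles.
    """
    steps = round(duration_min / dt)
    rate = glucose_mol / duration_min  # mol of glucose per minute (assumed constant)
    glucose, ethanol, co2 = glucose_mol, 0.0, 0.0
    for _ in range(steps):
        consumed = min(rate * dt, glucose)  # never consume more than remains
        glucose -= consumed
        ethanol += 2.0 * consumed
        co2 += 2.0 * consumed
    return glucose, ethanol, co2
```

Because the system is closed, the mass of ethanol and carbon dioxide produced should equal the mass of glucose consumed, which gives a quick consistency check on the stoichiometry.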

Table 1. Basic Fermentation Model Equations As a proof of concept, the Mark I model worked well. Volume 5 | 2015-2016 | 25

Street Broad Scientific Volume 1 | 2011-2012

Biology Research

By providing a basis for alcoholic fermentation, a simulated population of yeast could now be implemented into the model.

Fig. 1. Alcoholic Fermentation Model Mark I, providing the basic dynamics of an alcoholic fermentation equation

The first step in the process of combining population and fermentation was taking into account a stated energy demand of the population per unit of time. In order to survive, organisms require energy, and the energy demand of a population determines how much glucose is fermented at any given time. The Mark II version of the model simply factored in this energy demand by converting energy demand per unit of time in kilojoules to moles of glucose required by the following equation:

$$\text{MolesGlucose} = \frac{\text{EnergyDemand}}{150} \quad (2)$$

150 kilojoules is the amount of energy produced by anaerobic respiration from a single mole of glucose [5]. This stoichiometric conversion is the basis for determining how much glucose will be fermented per unit of time in each of the following models.

Fig. 2. Alcoholic Fermentation Model Mark II, which set the rate of fermentation to match a given energy demand of a population

Table 2. Mark II Fermentation Model Equations

With a rough form of the alcoholic fermentation model created, focus was then placed on creating a model of the yeast population. This was performed by implementing the Verhulst-Pearl equation [3], shown below:

$$\frac{dP}{dt} = rP\left(1 - \frac{P}{K}\right) \quad (3)$$

In this equation, “t” represents time, “P” represents the population, “r” represents the growth rate of the population, and “K” represents the carrying capacity of the environment. A simple proof of concept of this model in action, shown in Figure 3, was created with an initial population of 10 organisms, a carrying capacity of 1000 organisms, and a growth rate of 0.1 (or 10 percent per day); this model was run for 100 days with a time step of 0.0625. These numbers were chosen arbitrarily and are only placeholders for the proof of concept. Note that this model accounts for population growth only and does not address the death of organisms.
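The proof-of-concept run described above can be reproduced approximately with a simple Euler integration of the Verhulst-Pearl (logistic) equation. The sketch below uses the stated values (initial population 10, carrying capacity 1000, growth rate 0.1 per day, 100 days, time step 0.0625); the function name `simulate_logistic` is hypothetical, and Euler integration is assumed here rather than confirmed as the setting used in Vensim.

```python
def simulate_logistic(p0=10.0, k=1000.0, r=0.1, days=100.0, dt=0.0625):
    """Euler integration of dP/dt = r * P * (1 - P / K)."""
    steps = int(round(days / dt))
    population = p0
    history = [population]
    for _ in range(steps):
        growth = r * population * (1.0 - population / k)
        population += growth * dt
        history.append(population)
    return history


history = simulate_logistic()
print(f"population after 100 days: {history[-1]:.1f}")
```

Because the growth term shrinks as P approaches K, the population rises along the familiar S-shaped curve and levels off just below the carrying capacity.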

Fig. 3. Population Model Proof of Concept

S. cerevisiae and other variants of yeast have been shown to be sensitive to higher concentrations of ethanol due to a variety of genetic factors [4]. Increasing ethanol concentrations can lower the carrying capacity of a given environment. Figure 4 models this decrease as it applies to the Verhulst-Pearl equation. K was changed to equal 1000 - (0.75 * Time); the other arbitrary values used in the previous proof of concept were retained. Death was defined to occur only if the population exceeded the carrying capacity of the model, and acted as a means to prevent population size from growing beyond the environmental carrying capacity.

With a functioning population model now available, the focus shifted to integrating the fermentation and population models into a rough version of the final model. The Mark I (first) combined model simply placed the second population model into the second fermentation model and was tested to ensure that the numbers and interactions had been transferred properly. This was done by maintaining the same variable values and running the model to ensure the same results were obtained as in the separate model runs. An image is included for reference (Figure 5); however, no mathematical changes were made to either component of the model.
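The second stand-alone population model, with a carrying capacity that declines over time, can be sketched in the same way. The split into separate birth and death flows is one possible reading of the statement that death occurs only when the population exceeds the carrying capacity; the exact Vensim formulation is not given in the text, and the function name `simulate_declining_capacity` is hypothetical.

```python
def simulate_declining_capacity(p0=10.0, k0=1000.0, decline_per_day=0.75,
                                r=0.1, days=100.0, dt=0.0625):
    """Logistic growth with K(t) = k0 - decline_per_day * t."""
    population = p0
    history = [population]
    for step in range(int(round(days / dt))):
        t = step * dt
        k = k0 - decline_per_day * t           # K = 1000 - 0.75 * Time
        crowding = 1.0 - population / k
        # Assumption: growth applies below K, death applies only above K.
        births = r * population * max(crowding, 0.0)
        deaths = r * population * max(-crowding, 0.0)
        population += (births - deaths) * dt
        history.append(population)
    return history


history = simulate_declining_capacity()
print(f"peak population:  {max(history):.1f}")
print(f"final population: {history[-1]:.1f}")
```

In this sketch the population never reaches the initial capacity of 1000, because the capacity has already fallen by the time the population catches up with it.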

Fig. 4. Population Model With Decreasing Carrying Capacity

Fig. 5. Combined Model Mark I, combining the population and fermentation models

The Mark II version of the combined model (Figure 6) was a preliminary step towards combining these two components. The primary connection made was between population size and energy demand; the variable “Energy Demand Per Cell” was created to enable manipulation of energy demand based upon the population. This led to the variable “Moles of Glucose Required” being represented by the equation

$$\text{MolesGlucose} = \frac{E_d \times P}{150}$$

In this equation, “E_d” represents the energy demand per cell, “P” represents the population size, and 150 is the energy yield in kilojoules per mole of glucose introduced above. Once again, values used in the population portion of the model are unchanged from the arbitrary values assigned previously. The variable “Energy Demand Per Cell” was set arbitrarily to 0.0005 kJ per day.

Fig. 6. Combined Model Mark II, which began relating fermentation and the yeast population

The Mark III form of the model cleaned up the model and converted certain values into more practical units necessary for further improvements. This required that carbon dioxide and ethanol be represented in milliliters (a much more practical unit than grams) of gas and liquid respectively, and that the variable “Total Volume of Container” be included in the model. This allows the concentration of ethanol by volume to be calculated and related to the decreasing carrying capacity of the environment. For reference, a graphic of the Mark III model (Figure 7) is included.

Fig. 7. Combined Model Mark III
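The two quantities introduced by the Mark II and Mark III combined models can be illustrated with a short calculation. The sketch assumes that "Moles of Glucose Required" is the population's total energy demand divided by the 150 kJ per mole quoted in the text, and that ethanol volume is obtained from moles using a standard molar mass and the density of ethanol at room temperature (about 0.789 g/mL); the function names and the 1000 mL container are hypothetical.

```python
KJ_PER_MOLE_GLUCOSE = 150.0   # energy yield quoted in the text [5]
ETHANOL_G_PER_MOL = 46.069    # standard molar mass
ETHANOL_G_PER_ML = 0.789      # assumed density of liquid ethanol near 20 C


def moles_glucose_required(energy_demand_per_cell_kj, population):
    """Mark II: glucose needed per day to meet the population's energy demand."""
    return energy_demand_per_cell_kj * population / KJ_PER_MOLE_GLUCOSE


def ethanol_fraction_by_volume(ethanol_moles, container_ml):
    """Mark III: ethanol volume as a fraction of the total container volume."""
    ethanol_ml = ethanol_moles * ETHANOL_G_PER_MOL / ETHANOL_G_PER_ML
    return ethanol_ml / container_ml


# 1000 cells at the arbitrary 0.0005 kJ per cell per day used for Mark II.
print(moles_glucose_required(0.0005, 1000))
# One mole of ethanol in a hypothetical 1000 mL container.
print(ethanol_fraction_by_volume(1.0, 1000.0))
```

Returning the ethanol concentration as a fraction (0.10 for 10%) rather than a percentage keeps it directly usable in a carrying-capacity relationship that multiplies it by ten.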

The Mark IV model (Figure 8) finalized each section of the model with a few optimizations and the replacement of many arbitrary and/or inaccurate numbers. One change was the implementation of the Combined Gas Law in order to determine the actual volume of carbon dioxide produced through fermentation. Equation (4) rearranges this law in order to determine the volume of gas per mole of carbon dioxide:

$$V_2 = \frac{P_1 V_1 T_2}{T_1 P_2} \quad (4)$$

Fig. 8. Combined Model Mark IV, which replaced a variety of arbitrary values

In the above equation, “T” represents temperature, “P” represents pressure, and “V” represents volume; the subscript 1 denotes standard conditions and the subscript 2 denotes the conditions of the simulation. This equation replaced the value of 22.4 liters per mole of CO2, which holds only at standard temperature and pressure and is an unrealistic scenario for this system.

Significant changes to population growth were also important features of the Mark IV model. These included the addition of a death rate as well as the establishment of a set relationship between percent ethanol by volume and carrying capacity. The death rate was a rough calculation based on studies by Fabrizio and Longo on yeast lifespans in aquatic environments [2]. Their research indicated that the expected lifespan of a single cell would be approximately twenty days. In order to account for deviations from this approximation, a daily death rate of 5% was chosen. The relationship between percent ethanol by volume and carrying capacity was built upon the work of Da Silva et al. [1]. This research indicated that the maximum ethanol level at which life for a population of yeast remained viable was approximately 8%; however, a reference indicated that levels as high as 15% had been measured previously. While the value of 15% could not be verified by resources available during the development of this model, for ease of mathematics the estimate of 10% ethanol by volume was used as the point where the population of Saccharomyces cerevisiae would no longer be viable. The equation for carrying capacity became the following:

$$\text{CarryingCapacity} = \text{InitialCap} - (\text{InitialCap} \times (10 \times \%\text{Ethanol})) \quad (5)$$

The rate of population growth was also changed from an arbitrary number to a scientifically based estimate. The maximum life span of strains of yeast was mathematically determined to be approximately sixty-seven divisions [6]. When divided over the twenty-day lifespan determined by Fabrizio and Longo [2], this resulted in a maximum growth rate estimate of 3.35 that was utilized in the model.

The final change from the Mark III model to the Mark IV version was the inclusion of necessary fail-safes. A restriction on glucose level was put into place, halting the conversion of glucose into ethanol and carbon dioxide when the initial supply of glucose was depleted. Prior to this, the model would have dipped into negative values for the amount of glucose if energy demand was high and the initial glucose supply low. This unrealistic behavior was eliminated with the use of a simple “If Then Else” command. A similar change was put in place for population growth, halting the growth of the population when there was no glucose available. These adjustments made the model much more realistic than previous versions.

4. Discussion
Each iteration of the model continued to expand the factors considered during the simulation of a population of Saccharomyces cerevisiae undergoing alcoholic fermentation in a closed environment. The basic model for alcoholic fermentation, created to provide a framework for future models, graphically displayed the relationship between the three measured variables of glucose, ethanol, and carbon dioxide (Figure 9). This model was tested with no ethanol or carbon dioxide present in the environment prior to simulation and ten moles of glucose (approximately 1801.56 grams) available. Glucose visibly decreases in a linear manner while carbon dioxide and ethanol increase linearly. The second version of this model introduced the variable of energy demand and displayed the same relationship.
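The linear drawdown of a ten-mole glucose supply, together with the If-Then-Else restriction that stops fermentation once the glucose is gone, can be sketched as follows. The names `guarded_fermentation` and `run_guarded_model` are hypothetical, the demand of one mole of glucose per day is an arbitrary illustrative value, and the exact form of the Vensim expression is not given in the text, so the guard below is only one plausible version of it.

```python
def guarded_fermentation(glucose_moles, demand_moles_per_day, dt):
    """Moles of glucose fermented this step, never more than what is left."""
    if glucose_moles > 0.0:                                    # IF glucose remains
        return min(demand_moles_per_day * dt, glucose_moles)   # THEN ferment
    return 0.0                                                 # ELSE halt


def run_guarded_model(glucose_moles=10.0, demand_moles_per_day=1.0,
                      days=20.0, dt=0.0625):
    """Ferment at a fixed demand until the glucose supply is exhausted."""
    glucose, ethanol, co2 = glucose_moles, 0.0, 0.0
    for _ in range(int(round(days / dt))):
        fermented = guarded_fermentation(glucose, demand_moles_per_day, dt)
        glucose -= fermented
        ethanol += 2.0 * fermented
        co2 += 2.0 * fermented
    return glucose, ethanol, co2


glucose, ethanol, co2 = run_guarded_model()
print(glucose, ethanol, co2)
```

Without the guard, the same loop would keep subtracting glucose for the full twenty days and end at a negative value, which is the unrealistic behavior the fail-safe removes.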

Fig. 9. Basic Fermentation Model Output

The basic population growth model (Figure 3), based upon the Verhulst-Pearl equation (3), performed as expected. Over the duration of one hundred days the population grew to carrying capacity following a typical logistic curve. This duration of one hundred days became the standard duration for which each subsequent model was run. For reference, the output from this model with a growth rate of 0.1, an initial population of 10 cells, and a carrying capacity of 1000 cells is included (Figure 10).

Fig. 10. Basic Population Model Output

Table 3. Mark IV Model Equations
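Several of the Mark IV quantities described in the Methods reduce to short calculations, sketched below. The gas volume function assumes the Combined Gas Law is applied starting from 22.4 liters per mole at standard temperature and pressure (273.15 K and 1 atm); the carrying capacity function follows the relationship stated in the text, with ethanol expressed as a fraction so that 10% ethanol by volume gives a capacity of zero. All function and constant names are hypothetical, and the contents of Table 3 itself are not reproduced here.

```python
STP_TEMPERATURE_K = 273.15
STP_PRESSURE_ATM = 1.0
STP_LITERS_PER_MOLE = 22.4


def liters_per_mole(temperature_k, pressure_atm):
    """Combined Gas Law, P1*V1/T1 = P2*V2/T2, solved for V2."""
    return (STP_PRESSURE_ATM * STP_LITERS_PER_MOLE * temperature_k) / (
        STP_TEMPERATURE_K * pressure_atm
    )


def carrying_capacity(initial_cap, ethanol_fraction):
    """Capacity falls linearly with ethanol and reaches zero at 10% by volume."""
    return initial_cap - initial_cap * (10.0 * ethanol_fraction)


# Rates derived in the text: 67 divisions over a 20-day lifespan,
# and one expected death per 20 days of life.
MAX_GROWTH_RATE = 67 / 20
DAILY_DEATH_RATE = 1 / 20

print(liters_per_mole(298.15, 1.0))    # molar volume at 25 C and 1 atm
print(carrying_capacity(1000.0, 0.05))
print(MAX_GROWTH_RATE, DAILY_DEATH_RATE)
```

Keeping these as separate small functions makes it easy to check each assumption on its own, for example by confirming that the gas law returns 22.4 liters per mole when given standard conditions.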

The implementation of a decreasing carrying capacity in the second stand-alone population model displayed expected behavior as well. The decay has a noticeable impact on the growth of the population: despite its exponential growth, the total number of organisms fails to reach the initial carrying capacity of 1000 organisms. This effect is visible in the resulting output from this model, run with the same values for growth rate, initial population, and initial carrying capacity as the basic population model (Figure 11).


Fig. 11. Population Model With Decaying Carrying Capacity Output

The Mark I combined model did not change any aspect of the fermentation or population models; the Mark II model, however, associated population size and energy demand. This model was run with the parameters in Table 4. It displayed the weakness of a model lacking the fail-safes introduced in the Mark IV model: the continual increase of carbon dioxide and ethanol despite the lack of glucose to ferment was a significant issue (Figure 12).

Table 4. Mark II Model Parameters

Fig. 12. Mark II Model Output

The increased glucose supply and decreased energy demand per day created a significant difference once the Mark III model was simulated (Figure 13), as did the implementation of a decaying carrying capacity. One important note is that this model contains a unit shift from grams to milliliters of carbon dioxide. This is the reason why the curve on the output is much steeper than in previous models.

Table 5. Mark III Model Parameters

Fig. 13. Mark III Model Output

The decaying carrying capacity was noticeable in the final output graph from this simulation (Figure 14). While not refined and using an arbitrary mathematical relationship with percent ethanol by volume, this illustrates the interaction that was desired by this model. While the resulting percent ethanol by volume from this model was not very large, the linear increase follows what is expected from previous models. The Mark IV model was run with significantly different parameters than previous models. These are listed in Table 6.

Fig. 14. Mark III Population Output

Fig. 15. Mark III Percent Ethanol Output

Table 6. Mark IV Model Parameters

The if-then fail-safes put into place were utilized in the running of the Mark IV model. The glucose supply ran out after approximately 34.875 days. The halting of fermentation can be seen in the output graph from the model (Figure 16). The percent ethanol by volume (Figure 17) followed the expected trend of growth as the population and rate of fermentation increased, halting only when the supply of glucose ran out. The carrying capacity followed a similar trend (Figure 18). However, the change in population size (Figure 19) is the most interesting portion of this model. The calculated growth rate resulted in enormous population growth within the first three days of the simulation; however, this enormous growth resulted in the rapid decline of carrying capacity, reducing the maximum number of viable cells. This caused the population to follow the curve of the carrying capacity until fermentation stopped. At this point the population’s decay was furthered by the death rate until the entire population had died out.

Fig. 16. Mark IV Output

Fig. 17. Mark IV Percent Ethanol Output

Fig. 18. Mark IV Carrying Capacity

Fig. 19. Mark IV Population

5. Conclusion

Through increasingly complex computational models of population growth and alcoholic fermentation undergone by a population of Saccharomyces cerevisiae in a closed environment, a series of interactions can be visualized. Populations of S. cerevisiae, despite producing ethanol as a byproduct of the respiration performed to produce energy, are also incredibly sensitive to concentrations of ethanol. As concentrations of ethanol and carbon dioxide increase, the available supply of glucose and the carrying capacity of the environment both decline. Allowed to continue long enough, this process causes the entire population of S. cerevisiae to die. Understanding this interaction and being able to represent it mathematically is essential for modeling this system.

Refining variables enabled the model to become more accurate and be represented in more realistic ways. Converting the representation of carbon dioxide from grams to milliliters placed the value in a form more conducive to a gas. Later, accounting for variances in temperature and air pressure moved the model away from assuming Standard Temperature and Pressure. Ensuring that units are realistic and that values of variables are applicable to real-world situations is important if this model is to be useful beyond theoretical application.

Many refinements could still be made to this model. Determination of the energy demand for one S. cerevisiae cell over one day would significantly increase the accuracy of the model; the value of 5e-005 kilojoules per day is one of the few arbitrary values remaining in the model. Because this number directly impacts the moles of glucose required for a given day, it is a key value at which changes significantly alter the results of a simulation. Another improvement would be the adjustment of the growth and death rates of the population to more accurate values. While scientifically based, the values used are little more than estimations based upon available scientific data. These are also values that could be influenced by genetic traits as well as varying environmental factors, all of which would have to be accounted for in some way.

An extension of this model would take into account a wider variety of carbohydrates that are usable by S. cerevisiae. While this model focuses exclusively on glucose, other carbohydrates such as fructose are also known to be used during alcoholic fermentation. Constructing a model that takes into account the presence of various carbohydrates would greatly expand the size and scope of this research; however, that model would also be significantly more useful in the brewing and viticulture industries due to its broader reach.


6. Acknowledgements
The author would like to thank Robert Gotwals for his guidance in this research, and the North Carolina School of Science and Math for providing software and academic resources. Without these, this research would not have been possible. Thanks are also extended to Dr. Robin Boltz for assistance in obtaining research materials as well as Keara Halpern and Bryan Hayes for their input.

7. References
[1] Da Silva, R. O., Batistote, M., and Cereda, M. P. Alcoholic fermentation by the wild yeasts under thermal, osmotic and ethanol stress. Brazilian Archives of Biology and Technology 56, 2 (2013), 161–169.
[2] Fabrizio, P., and Longo, V. D. The chronological life span of Saccharomyces cerevisiae. Methods in Molecular Biology (Clifton, N.J.) 371 (2007), 89–95.
[3] Garnier, J., and Quételet, A. Correspondance mathématique et physique. No. v. 10. Impr. d’H. Vandekerckhove, 1838.
[4] Ma, M., and Liu, Z. L. Mechanisms of ethanol tolerance in Saccharomyces cerevisiae. Applied Microbiology and Biotechnology 87, 3 (2010), 829–845.
[5] Roberts, M., Reiss, M. J., and Monger, G. Advanced Biology. Nelson Thornes, 2000.
[6] Sinclair, D., Mills, K., and Guarente, L. Aging in Saccharomyces cerevisiae. Annual Review of Microbiology 52 (1998), 533–560.
[7] Ventana Systems, Inc. Vensim PLE software, 2006.


Conformational dynamics of HIV-1 variable loop domains for CCR5-using M-tropic and T cell-tropic viruses
Katherine Li

ABSTRACT
The effectiveness of current HIV-1 therapies is limited by the virus’s ability to mutate rapidly and establish latency in a diverse population of host cells. To overcome this obstacle, a deeper understanding of the biochemical principles that allow the virus to undergo tropism switching and gain entry into alternative cell types is required. The HIV-1 envelope protein, also known as Env, is the sole mediator of viral attachment and entry into host immune cells. While the ability of HIV-1 to enter CD4+ dense cell types, such as T cells (T cell-tropic viruses), has been well characterized, viral variants capable of entering cell populations with reduced CD4+ densities, such as macrophages (M-tropic viruses), are less understood. The goal of this study is to develop and interrogate pseudo-type Env clonal isolates of paired M-tropic and T cell-tropic viruses. The generation of paired clonal isolates of both M-tropic and T cell-tropic viruses was achieved, in addition to the development of paired mutants containing fluorophore-recognizing peptide sequences in the Env variable loop domains 1, 2 and 4 (V1/V2, V4). In the long term, these mutant viruses will be imaged using smFRET to identify structural differences in HIV-1 Env that aid in the regulation of HIV tropic preference. These findings are an important step toward understanding the role of HIV-1 M-tropic viruses in pathogenesis.

1. Introduction
Nearly 39 million people have died from HIV-associated illnesses since 1981. As of 2013, there are approximately 35 million people worldwide living with HIV, with an estimated 2.1 million new cases reported each year [1]. HIV is a single-stranded RNA virus belonging to the family Retroviridae [2] that targets the human immune system and results in the development of acquired immune deficiency syndrome (AIDS). At present, there are only two clinical strains of HIV, designated as HIV-1 and HIV-2. HIV-1 is the more globally predominant strain and therefore represents a priority in therapeutic research [3]. Current treatment models against HIV-1 involve the use of antiretroviral (ARV) drug cocktails to suppress viral expression and slow the onset of AIDS. Unfortunately, suppressive therapies of this nature are expensive and do not offer a long-term solution for managing chronic infections. While intensive research has been done to develop a cure for HIV, the ability of the virus to undergo rapid mutation and establish latency significantly hinders the development of more sustainable treatment models. Increased understanding of the biological mechanisms that allow HIV-1 to undergo host cell expansion and establish not yet characterized latent reservoirs is essential for the development of more effective treatment models. To accomplish this goal, a better understanding of the biochemical pathways that regulate host cell attachment and entry must be achieved.

The HIV-1 envelope (Env) is solely responsible for the attachment of HIV-1 to host cells. The diversity in the HIV-1 population as a result of mutation can be seen in Env and yet, the entry efficiency of HIV-1 viral variants appears to be unchanged, or even possibly enhanced. This observation suggests the presence of conserved structural regions of HIV-1 Env that function independently of the protein sequence. Thus, the primary goal of this study is to generate pseudo-type clones of HIV-1 viral variants with diverse Env sequences and altered host cell entry requirements for the purpose of identifying unique Env structural characteristics that regulate host cell expansion.

The traditional model of HIV-1 pathogenesis suggests that a high density of surface CD4+ is essential for the successful attachment and entry of the virus into target host cells such as T cells (T cell-tropic) [4]. The discovery of viral variants capable of entering cell types, such as macrophages (M-tropic), with a less dense population of CD4+ on the cell surface suggests that during viral pathogenesis, the tropic preference of HIV-1 expands to alternative cell types [5]. The ability of the virus to gain entry into alternative cell types is theorized to be regulated by HIV-1 Env because of the protein’s essential role in host cell recognition and attachment.

HIV-1 Env is a 160 kD glycoprotein (gp160) located on the surface of HIV-1. After translation, Env is cleaved into a gp120 subunit and a gp41 subunit. The gp41 subunit becomes anchored into the viral membrane as a trimeric transmembrane protein. The gp120 subunit forms a trimeric spike on the virion surface and is made up of five conserved regions (C1-C5) interspersed with five variable regions (V1-V5). The association of the HIV-1 trimeric spike with host cell CD4+ triggers a conformational shift in viral gp120 from a closed state to an open state. During this highly dynamic process, V1/V2 of gp120 are re-oriented to expose the V3 domain. The V3 domain functions as the critical determinant of HIV-1 tropic preference through its interaction with host cell co-receptor CCR5 (R5) or CXCR4 (X4) [4]. In more recent studies on HIV-1 entry phenotype behavior, the V1/V2 domain has been shown to also influence host co-receptor recognition and engagement [6], which is why this study also aims to investigate the structural characteristics of the V1/V2 domain and their functional influence on HIV-1 tropic preference. Additionally, the sequence diversity of V1/V2 allows the domains to function as constantly evolving immunogenic targets that mask highly conserved regions of Env such as the V3 domain. Consequently, the V1/V2 domain of HIV-1 Env represents an area of needed research.

In order to expand understanding of the role of V1/V2 in host cell association and how its function differs in M-tropic and T cell-tropic viruses, it is necessary to investigate the conformational dynamics of the variable loop domains for M-tropic viruses in relation to T cell-tropic viruses. Even though the movement of the V1/V2 domain regulates the formation and exposure of the Env co-receptor binding site, not much is known about the biochemical principles associated with the movement of this region and how the Env structure-function relationships change with entry phenotype and tropic preference. The development of viruses with altered receptor/co-receptor usage and viral tropism suggests that a mechanism may exist to allow HIV-1 to evolve within a single host into variants that can infect a variety of host cell tissues, which is extremely important for targeting HIV-1.
To further elucidate this mechanism, fluorophore-recognizing peptide sequence tags were inserted into the V1 and V4 regions of R5-using HIV-1 Env clonal isolates derived from T cell-tropic and M-tropic viruses isolated from the blood plasma and cerebrospinal fluid of a chronically infected subject. In the long term these tags will be used to conduct single-molecule fluorescence resonance energy transfer (smFRET), a technique that measures the extent of non-radiative energy transfer between two fluorescent dye molecules (an acceptor and a donor) to determine the distance between the two molecules based on the ratio of acceptor intensity to total emission intensity [7]. While T cell-tropic HIV-1 viral coat proteins have previously been compare how tropism affects conformational changes within Env during host cell association. Ultimately, this may provide insights into the evolution of viral pathogenesis within the central nervous system and further develop hypotheses about the biochemical relationships responsible for the expansion of HIV-1 into alternative cell types.
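The intensity ratio mentioned above can be turned into a distance estimate with the standard Förster relation, E = 1 / (1 + (r / R0)^6). The sketch below is illustrative only: the function names are hypothetical, the Förster radius of 5.0 nm is a placeholder rather than a value for the dyes used in this study, and the raw intensity ratio is treated as the transfer efficiency without the detector and background corrections that a real smFRET analysis would apply.

```python
def fret_efficiency(acceptor_intensity, donor_intensity):
    """Apparent FRET efficiency: acceptor intensity over total emission."""
    total = acceptor_intensity + donor_intensity
    if total <= 0:
        raise ValueError("total emission intensity must be positive")
    return acceptor_intensity / total


def dye_distance(efficiency, forster_radius_nm):
    """Invert E = 1 / (1 + (r / R0)**6) to get the donor-acceptor distance."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0_NM = 5.0  # hypothetical Forster radius, for illustration only
efficiency = fret_efficiency(acceptor_intensity=800.0, donor_intensity=200.0)
print(efficiency, dye_distance(efficiency, R0_NM))
```

Because of the sixth-power dependence, the efficiency changes most steeply when the dyes are separated by roughly one Förster radius, which is why the technique is well suited to tracking conformational shifts of a few nanometers.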

2. Materials and methods

2.1 Isolation of Paired Macrophage-Tropic and T Cell-Tropic HIV-1 Env
HIV-1 RNA was isolated from JRCSF blood plasma and CSF samples. Single genome amplification (SGA) of the full-length HIV-1 env gene was then conducted by other researchers in the lab to characterize the isolated viruses [10,11]. Viruses isolated from blood plasma and CSF were identified as paired based on degree of sequence homology and difference in tropic behavior. The SGA amplicons were sequenced from the start of gp120 to the end of gp41, and Affinofile cell assays were conducted to determine genetic diversity and tropism of the isolated envelopes [5]. Subject 4059 was identified as paired, and envelope clonal isolates from this virus were cloned into the pcDNA 3.1D/V5-His-TOPO expression vector (Invitrogen) using the pcDNA 3.1 directional TOPO expression kit (Invitrogen) and MAX Efficiency Stbl2 competent cells (Invitrogen) as per the manufacturer’s instructions for mutagenesis.

2.2 Construction of Tagged Viruses

2.2.1 Overlap-Extension PCR Cloning
Peptides were inserted into the gp120 domain of paired HIV-1 Env clonal isolates using overlap-extension PCR. Peptide Q3 (GQQQLG) was inserted into the V1 loop and either peptide A4 (DSLDMLEW) or peptide A6 (GDSLDM) was inserted into the V4 loop (Fig. 1). Insertion sites were chosen to be in regions of the loops that are not conserved and to avoid major protein structural features such as beta sheets and glycosylation sites. DNA primers used to clone sequence tags into subject 4059 Env were designed using Sequencher software to contain a single sequence tag, either V1-Q3 (5’-GGCCAGCAACAGCTCGGC-3’), V4-A4 (5’-GACTCTCTTGATATGTTGGAGTGG-3’), or V4-A6 (5’-GGAGACTCTCTTGATATG-3’), flanked by ~20 bp of template DNA upstream and ~40 bp of template DNA downstream of the insertion site. Phusion DNA polymerase (New England BioLabs), primers, and 4059c/4059p template DNA were used to create 50 µL reactions with 60 ng of template DNA each for the overlap-extension PCR. Each PCR was subjected to the following temperature regimen in a thermocycler: initial denaturation at 100°C for 2 min; 30 cycles of denaturation at 94°C for 30 s, annealing at 60°C for 30 s, and extension at 68°C for ~110 s/kb (15 min); and a final extension at 68°C for 15 min. All PCR reactions were digested with 2 µL of DpnI at 37°C for 1-2 hours to remove methylated template DNA, then PCR purified (QIAGEN QIAquick PCR Purification Kit) and eluted twice through the spin column.

Due to the size of the insert and plasmid, the entire procedure from overlap-extension PCR to transformation into bacteria was extremely inefficient and required substantial modification of standard cloning protocols and testing conditions. Although performing PCR purification after overlap-extension PCR with a large plasmid usually leads to the loss of a large percentage of the desired product DNA, we found that transforming after only a DpnI digest was too inefficient and led to minimal growth of bacterial colonies on only 50% of the plates. PCR purification increased the ratio of PCR product to excess PCR reagents and led to the successful growth of between 10 and 20 colonies on all of the transformation plates. After the PCR purification, 12 µL of digested and purified PCR product were transformed into 100 µL of MAX Efficiency Stbl2 chemically competent cells (Invitrogen) and plated on LB agar plates containing carbenicillin (carb). As controls, 4059c and 4059p template DNA was transformed and plated on LB agar plates containing carbenicillin as a positive control for bacterial growth, and untransformed Stbl2 cells (Invitrogen) were plated on identical plates as a negative control. After incubating overnight at 25°C, multiple colonies from different sections of each plate were selected and miniprepped (QIAGEN QIAprep Spin Miniprep Kit) for Sanger DNA sequencing at the UNC-CH Genome Analysis Facility. V1-Q3 mutants were sequenced with primer F6104, and V4-A4 and V4-A6 mutants were sequenced with primer For15. DNA sequences and chromatograms were analyzed in Sequencher to confirm that the tags had been successfully inserted. Sequences were judged to be properly cloned only if there were no base-pair mismatches or sequencing errors, such as low chromatogram readings, over the length of the high-quality sequence. The entire env gene was sequenced from gp120 to gp41 to ensure that the tag had been inserted successfully and that no other regions of the envelope had been affected; full-length sequencing of the plasmid was not conducted.

2.2.2 Sequence Tag Ligations
After it was confirmed that the sequence tags had been successfully inserted, the singly tagged paired viral envelope isolates were restriction enzyme digested and ligated to contain a pair of tags, either the V1-Q3 tag and the V4-A4 tag or the V1-Q3 tag and the V4-A6 tag.
Plasmids were digested by XbaI and BsrGI to create two fragments: a longer, ~6500 bp fragment containing the V1 domain and the V1-Q3 tag, known as the vector, and a shorter, ~2000 bp fragment containing the V4 domain and the V4-A4 or V4-A6 tag, known as the insert. 5 µg of plasmid DNA and 50 units/µL of restriction enzyme were used to create a 60 µL reaction that was incubated at 37°C for 2 hours to complete the restriction enzyme digest. Immediately afterwards, 50 µL of the vector fragment was treated with 2 µL of calf intestinal phosphatase (CIP) at 37°C for 1 hour and PCR purified. The vector and insert were then separated on an agarose gel, and the desired bands containing the sequence tags were cut out and gel extracted (QIAGEN QIAquick Gel Extraction Kit). Quick ligation (NEB Quick Ligation Kit) was used to ligate the vector containing the V1-Q3 tag and the insert containing the V4-A4 or V4-A6 tag. 1 µL of Quick T4 DNA Ligase was added to 50 ng of vector, a 3-fold molar excess of insert, Quick Ligation buffer, and dH2O to create a 20 µL reaction volume. The ligation was allowed to proceed for 5 minutes at room temperature and then placed on ice for transformation. 1 µL of the ligation product DNA was transformed into 25 µL of Stbl2 cells (Invitrogen) and plated on LB agar plates with carb. Multiple colonies were selected from each plate and sequenced at the UNC-CH Genome Analysis Facility using Sanger DNA sequencing to confirm that the two tags had been successfully inserted into one envelope gene.
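The mass of insert corresponding to a 3-fold molar excess follows from the fragment lengths, since for double-stranded DNA the number of moles is proportional to mass divided by length. A minimal sketch, using the approximate fragment sizes given above (~6500 bp vector, ~2000 bp insert) and a hypothetical function name:

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=3.0):
    """Mass of insert giving the requested insert:vector molar ratio."""
    # Moles scale with mass / length, so scale the vector mass by the length
    # ratio to get an equimolar amount, then multiply by the desired excess.
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


print(f"{insert_mass_ng(50.0, 6500, 2000):.1f} ng of insert")
```

With 50 ng of vector this works out to roughly 46 ng of insert; because the fragment sizes are approximate, the result should be read as a guide rather than an exact quantity.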

3. Results
We successfully inserted peptide sequence tags into the V1 and V4 domains of HIV-1 Env clonal isolates from paired T cell-tropic and M-tropic viruses from a single donor, developing a tool for investigating and comparing the importance of certain structural features of the Env variable loop domains for viral infectivity and protein folding in T cell-tropic and M-tropic viruses. In order to successfully clone the tags and minimize the effect of the mutated envelope on viral pathogenesis, various insertion protocols and conditions, sequence tag lengths, and insertion sites were tested. To measure the effects of different sequence tag lengths and insertion sites on viral infectivity and protein structure of Env, two sequence tags of different lengths were inserted into the V4 domain to determine which tag would be processed into the gp120 domain with the smallest effect on protein structure and infectivity: the longer A4 sequence tag (DSLDMLEW) or the shorter A6 sequence tag (GDSLDM). We analyzed crystal structures and amino acid sequences of Env to choose insertion sites in highly variable regions that are not conserved and do not form beta sheets, glycosylation sites, or other major protein structures upon protein translation and folding (Fig. 1). The V1 insertion site was chosen based on insertions made in T cell-tropic viruses in previous literature [8] and was made at the same residue in 4059c and 4059p as a control to compare the effects of the presence of a sequence tag in this region on protein folding and infectivity in M-tropic and T cell-tropic viruses. V4 insertion sites were similarly chosen to be at the same site in 4059c and 4059p with the least chance of affecting viral function, although for the V4-A4 4059p tag, the insertion was made farther downstream to avoid any possible interference between the bulkier A4 tag and the glycosylation sites in the region during protein folding.
Overlap-extension PCR was also modified to clone the sequence tags V1-Q3, V4-A4, and V4-A6 as simply and efficiently as possible into the 4059 paired viral envelopes. The entire 8.5 kb expression plasmid was replicated during PCR to insert the tags. Once all sequence tags had been successfully inserted following optimization of the cloning protocol, the plasmids were digested with restriction enzymes to separate the V1 and V4 domains and run on an agarose gel to confirm the presence of the vector and the insert prior to ligation, which further allowed us to confirm that overlap-extension PCR was a reliable method for sequence tag insertion (fig. 2).

Figure 1. Peptide insertion sites in variable loops of gp120 of HIV-1. Insertion sites were chosen in paired 4059 HIV-1JRCSF Env based on criteria described in Materials and Methods. Peptide insertions are indicated in bold; standard Hxb2 numbering is used to label HIV-1 envelope amino acid residues in figure. The peptides inserted at these sites are referred to in the text as V1-Q3, V4-A4, and V4-A6.

Figure 2. Agarose gel electrophoresis results for XbaI and BsrGI restriction enzyme digested 4059 paired viral envelopes containing either V1-Q3, V4-A4, or V4-A6 sequence tags. Bands marked in red are labelled with the sequence tag contained within them and were cut out under blue light and gel extracted for quick ligation.

The two sequence tags, either V1-Q3 and V4-A4 or V1-Q3 and V4-A6, were then ligated into one plasmid using quick ligation and sequenced for accuracy. Sanger DNA sequencing from the start of the gp120 domain to the end of the gp41 domain confirmed the successful ligation of the two tags into one envelope gene without any errors, although full-length sequencing of the entire plasmid would be the only way to ensure that the rest of the envelope gene and plasmid had been cloned without any polymerase errors from PCR (fig. 3).

Figure 3. Sanger DNA sequencing chromatograms for 4059cQ3-A4, 4059cQ3-A6, 4059pQ3-A4, and 4059pQ3-A6 ligations obtained from UNC-CH Genome Analysis Facility and shown on Sequencher. One properly sequenced dually tagged envelope clone is shown for each paired virus. The nucleotide sequence is shown with the amino acid sequence below and the sections of the envelope containing the sequence tag (V1 and V4) highlighted in red. The chromatograms indicate that the ligations were successful and that sequencing was conducted without any errors.

4. Discussion

The mechanism by which HIV-1 evolves to enter and replicate in macrophages is poorly understood despite its significance in HIV-1 viral expansion; we hypothesized that the V1/V2 domain may affect this switch in viral tropism because of its essential role in HIV-1 attachment and entry into host cells. In this study, we developed a method for creating divergent entry phenotype models that will be interrogated to obtain a deeper understanding of the role of the V1/V2 domain in the development of M-tropic viral variants. Two sequence tags were successfully cloned into paired M-tropic and T cell-tropic envelope proteins from a single donor, one tag in the V1 domain and one tag in the V4 domain. These sequence tags are significant because they provide the basis for future studies on the HIV-1 envelope conformational dynamics using smFRET. This technique is essential for visualizing the movement of the V1/V2 domain during host cell association, allowing us to create a dynamic profile of the M-tropic and T cell-tropic gp120 in the future. In previous literature, a Q3 peptide (GQQQLG) and an A1 peptide (GDSLDMLEWSLM) were successfully inserted into the V1 and V4 loops, respectively, to produce a movement profile of the variable loops [8]. However, those tags were only cloned into HIV-1 T cell-tropic envelope proteins. We optimized the overlap-extension PCR cloning protocol and successfully inserted sequence tags into both T cell-tropic and M-tropic viruses. These tags provide a platform not only to understand the conformational dynamics of V1/V2 in M-tropic viruses, but also to elucidate how the V1/V2 movement differs between T cell-tropic and M-tropic viruses. We also chose to test the insertion of an A4 peptide (DSLDMLEW) and an A6 peptide (GDSLDM) into the V4 domain rather than an A1 peptide as was done previously, allowing us to confirm that the length of the tag does not affect the efficiency of the insertion protocol. This finding expands the flexibility of the protocol and our ability to test sequence tags of varying lengths within the gp120 domain.

Furthermore, this method will help expand understanding of the relationship between the protein structure and function of Env. Mutating different regions of the viral envelope to contain peptide sequences may alter structural features of the gp120 domain and interfere with proper protein folding and function. Peptide insertion sites were initially chosen to be in regions of the variable loop domains that were not conserved and to avoid glycosylation sites. A failure of the tag to insert properly or of the mutated envelope to function normally may be due to the disruption of a residue within the env gene essential for HIV-1 infection. Selecting additional insertion sites would allow us to examine how the structure of the V1/V2 domain is related to the expression of conserved structural epitopes for host cell association that can later be used for therapeutic discovery.
In general, our results and observations are consistent with previous findings that indicate that singly and dually tagging viral envelopes is a valid method for investigating changes in HIV-1 pathogenesis, specifically in the variable loop domains of the HIV-1 envelope protein. As the sole regulator of HIV-1 attachment and entry into host cells, the envelope protein is a major therapeutic target. Thus, it is essential to understand as much as possible about Env structure-function relationships and the biochemical mechanisms associated with switches in HIV-1 viral tropism. By successfully inserting two sequence tags into the gp120 domains of paired viruses, we allow for the creation of high-resolution dynamic models of the V1/V2 domain for both M-tropic and T cell-tropic HIV-1 viruses that will ultimately increase our understanding of HIV-1 viral expansion.

5. Conclusions

The data presented here demonstrate that singly and dually tagging paired HIV-1 Env clonal isolates via overlap-extension PCR is an effective means for investigating the complex structure and conformational shifts of the envelope protein. We successfully cloned sequence tags of varying lengths into the viral envelope of both T cell-tropic and M-tropic viruses, indicating that smFRET imaging can also be applied to M-tropic viruses to develop movement profiles. These findings serve as an important step toward developing a wider variety of novel techniques for understanding compartmentalized HIV-1 macrophage-tropic viruses and the role of viral tropism in the expansion of viral pathogenesis in vivo. While the presence of the sequence tags was confirmed through Sanger DNA sequencing of the mutated gp120 and gp41 domains, the effect of the inserted tags on the protein structure of Env and the infectivity of the virus remains to be assessed. To assess infectivity of the mutated viral envelopes, the singly and dually tagged viral envelopes will be co-transfected with the NL4-3 viral backbone and a luciferase reporter gene to produce pseudoviruses, viruses lacking the enzymes necessary for reproduction. Transfection of the tagged viral envelope plasmids will also confirm that the tagged Env proteins are still processed normally into gp120 and gp41 and incorporated successfully into virions despite the peptide insert. Those pseudoviruses will then be evaluated for biological relevance using luciferase assays that will quantify the infectivity levels of the mutated virions and compare them to the infectivity levels of wild type HIV-1. To do this, the pseudoviruses will be incubated with a set concentration of TZM-bl cells also containing the luciferase reporter gene. The resulting luminescence from the luciferase reporter will be measured using a luminometer to determine the amount of luciferase protein produced in the cells as a measurement of infectivity.

This infectivity measurement will allow us to determine how the sequence tags affect the ability of the virus to infect host cells. Once the infectivity levels of the dually tagged mutated envelopes are optimized to be as close to wild type infectivity levels as possible, those envelopes will be used in single molecule fluorescence resonance energy transfer (smFRET) imaging to create movement profiles for the viruses. Donor and acceptor fluorophore insertion and smFRET will be conducted as described in the literature [7,8] to monitor movement of the V1/V2 and V3 domains in the two R5 T cell-tropic and M-tropic viruses. Data from smFRET will ultimately be used to characterize the conformational dynamics of the variable loop domains and compare them in HIV-1 R5 viruses with different tropisms.

6. Acknowledgements

I would like to thank Dr. William D. Graham and Dr. Michael J. Bruno for their invaluable support, advice and guidance throughout this research project. I am also grateful to Dr. Ronald Swanstrom and the Swanstrom Lab at the UNC-CH Lineberger Comprehensive Cancer Center for providing the facilities and equipment for the experimentation and research. Finally, I would like to thank the Research in Chemistry program at the North Carolina School of Science and Mathematics for providing this research opportunity to me.

7. References

[1] WHO. WHO | HIV/AIDS Fact Sheet. (2014). at <>
[2] Turner, B. G. & Summers, M. F. Structural biology of HIV. J. Mol. Biol. 285, 1–32 (1999).
[3] Marlink, R. et al. Reduced rate of disease development after HIV-2 infection as compared to HIV-1. Science 265, 1587–1590 (1994).
[4] Arrildt, K. T., Joseph, S. B. & Swanstrom, R. The HIV-1 env protein: a coat of many colors. Curr. HIV/AIDS Rep. 9, 52–63 (2012).
[5] Arrildt, K. T. et al. Phenotypic Correlates of HIV-1 Macrophage Tropism. J. Virol. (2015). doi:10.1128/JVI.00946-15
[6] Pastore, C. et al. Human immunodeficiency virus type 1 coreceptor switching: V1/V2 gain-of-fitness mutations compensate for V3 loss-of-fitness mutations. J. Virol. 80, 750–8 (2006).
[7] Roy, R., Hohng, S. & Ha, T. A practical guide to single-molecule FRET. Nat. Methods 5, 507–16 (2008).
[8] Munro, J. B. et al. Conformational dynamics of single HIV-1 envelope trimers on the surface of native virions. Science 346, 759–63 (2014).
[9] Schnell, G., Spudich, S., Harrington, P., Price, R. W. & Swanstrom, R. Compartmentalized human immunodeficiency virus type 1 originates from long-lived cells in some subjects with HIV-1-associated dementia. PLoS Pathog. 5, e1000395 (2009).
[10] Salazar-Gonzalez, J. F. et al. Deciphering human immunodeficiency virus type 1 transmission and early envelope diversification by single-genome amplification and sequencing. J. Virol. 82, 3952–70 (2008).
[11] Schnell, G., Joseph, S., Spudich, S., Price, R. W. & Swanstrom, R. HIV-1 Replication in the Central Nervous System Occurs in Two Distinct Cell Types. PLoS Pathog. 7, e1002286 (2011).


Comparison of Support Vector Regression Models of Transcription Factors E2F1 and E2F4’s Binding Specificities to DNA Sequences

Sunwoo Yim

ABSTRACT

Currently, there is a lack of data on the differences in binding specificities between transcription factors (TFs) with highly similar structural domains. This study focused on human TFs E2F1 and E2F4 by using support vector regression (SVR) to train models from genomic-context protein binding microarray (gcPBM) data. These models were analyzed in such a way that the significant features’ weights could be extracted. Analysis of the most significant features of each model showed that nucleotide interdependency in the six-base-long flanks on either side of the core significantly contributed to the binding preference of both TFs. The models were also compared by core and by TF, and the features with the greatest deviation between pairs of models were studied. E2F1 had a higher specificity than E2F4 for A and T trinucleotides in the flanking regions of the core binding sites, and the preferences of E2F1 were more easily predicted by sequence features than those of E2F4. Finally, comparison between preferences for different cores demonstrated that the SVR model had higher accuracy when predicting sequences with GCGC cores than those with GCGG cores.

1. Introduction

Although every cell in the human body contains the exact same DNA, groups of cells express the DNA information differently and thus play different roles in the body. Cells become skin cells rather than blood cells or cancerous cells rather than healthy ones largely due to transcription factors. TFs bind to specific DNA sites and regulate the process of transcription, thus controlling expression of genetic information (Figure 1). Because TFs play such a major role in deciding how information is decoded in the body, the need to understand TF binding specificities to DNA sequences has become essential in genomics and bioinformatics.

Figure 1: Transcription factors bind to DNA sites to regulate adjacent genes. The genes are then transcribed and translated into proteins [1].

Out of the thousands of TFs found in the human genome, TFs E2F1 and E2F4 were chosen as the focus of the project because they play vital roles in the human body by regulating cell proliferation and apoptosis [9]. As a result, they are crucial during the loss of retinoblastoma (Rb) tumor suppressor function, which can lead to uncontrolled malignant cell growth and thus human cancer [11]. Much of the research that aims to discover patterns in the sequences bound by specific TFs such as E2F1 and E2F4 is based on models such as position weight matrices (PWMs) [8]. However, PWMs fail to account for any interdependencies between nucleotides because they show the relative frequencies of each nucleotide at each position in the sequences bound by TFs, independent of the other positions. This can lead to incorrect interpretations because, by nature, the structures of these proteins lead to complex interactions with DNA sequences, often involving multiple nucleotides in the binding process. Other studies support the need to include nucleotide relationships, as models that also include 2-mer or 3-mer features perform significantly better when predicting TF binding specificities for DNA sequences [7]. This is further supported by the PWMs for E2F1 and E2F4, which show little difference in their preferred sequences (Figure 2A). However, the in vivo data from chromatin immunoprecipitation with DNA sequencing (ChIP-seq) tell a much different story, with the majority of the sequences being bound differently by the two TFs (Figure 2B). This suggests that other factors such as nucleotide interdependency come into play during the TF binding process. This research project aimed to explore which discrepancies in the sequences bound by TFs E2F1 and E2F4 led to these proteins binding so differently to DNA despite sharing very similar protein structures.
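The limitation of PWMs described above can be illustrated with a toy example. The sketch below is not part of the original analysis: the sequences and the helper name `position_frequencies` are made up for illustration. It builds per-position nucleotide frequencies for two small sets of 2-mers. Both sets give the same frequency table even though they share no dinucleotide, so a position-independent model cannot tell them apart, whereas k-mer features can.

```python
from collections import Counter

def position_frequencies(seqs):
    """Return, for each position, the relative frequency of each nucleotide."""
    length = len(seqs[0])
    freqs = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in seqs)
        freqs.append({nt: counts[nt] / len(seqs) for nt in "ACGT"})
    return freqs

# Two hypothetical sets of bound sequences with different dinucleotide content.
set_a = ["AA", "TT"]
set_b = ["AT", "TA"]

# Per-position frequencies are identical, so a PWM built from either set is the same,
same_pwm = position_frequencies(set_a) == position_frequencies(set_b)
# while the dinucleotides (2-mer features) are completely different.
shared_2mers = set(set_a) & set(set_b)

print(same_pwm, shared_2mers)
```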

2. Methods and Materials

The raw experimental data on the binding specificities of TFs E2F1 and E2F4 used in this project contained thousands of 36-base-long DNA sequences and their respective log signal intensity scores. Laboratory staff experimentally produced the data for E2F1 and E2F4 using genomic-context protein binding microarrays, or gcPBMs [7], which were also developed in the lab based on uPBMs, or universal PBMs [2]. The gcPBMs were used to determine relative TF binding affinities to the tested DNA sequences and were preferred over uPBMs because the synthesized DNA sequences contained genomic context and were shown to simulate in vivo results more closely. Then, support vector regression (SVR), a form of machine learning, was used to train a model from the experimental data and predict the binding specificity scores for new DNA sequences [10]. To do so, a library for support vector machines (LIBSVM) was adopted [6] and its Java code altered to analyze the experimental data [3].

Figure 2: (A) DNA binding motifs (from Transfac [4]) represented with PWMs. (B) In vivo DNA binding data (from ENCODE [5]) for E2F1 and E2F4, showing few shared sequences.

2.1 Process and filter raw gcPBM data

Data from the gcPBM for both TFs were processed before being filtered. First, all the sequences were made 34-mers by cutting off the “A” or “T” nucleotide that had been attached to each end of the sequence to facilitate the primer double-stranding process. Then, the reverse complement of all the sequences with cores CCGC was taken to turn them into sequences with core GCGG. Each sequence was also measured in two orientations, because the PBM tested each sequence with either end attached to the microarray’s glass slide. The best orientation score was chosen to represent each unique sequence. Lastly, because the data contained duplicate sequences with different scores, each duplicated sequence was assigned the median of its log signal intensity scores.
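The processing steps above can be sketched in a few lines. This is an illustrative sketch only, not the project's code (which was based on Java): the record layout `(36-mer, score for orientation 1, score for orientation 2)`, the function names, and the reading of "best orientation score" as the higher of the two scores are all assumptions.

```python
from statistics import median

COMPLEMENT = str.maketrans("ACGT", "TGCA")
CORE = slice(15, 19)  # positions 16-19 of the 34-mer (1-based)

def reverse_complement(seq):
    """Complement every base and reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]

def preprocess(records):
    """records: iterable of (36-mer, score_orientation_1, score_orientation_2)."""
    scores_by_seq = {}
    for seq36, score1, score2 in records:
        seq = seq36[1:-1]                # drop the added end nucleotides -> 34-mer
        if seq[CORE] == "CCGC":          # express CCGC cores as GCGG
            seq = reverse_complement(seq)
        best = max(score1, score2)       # assumption: "best" means the higher score
        scores_by_seq.setdefault(seq, []).append(best)
    # duplicated sequences get the median of their log signal intensity scores
    return {seq: median(vals) for seq, vals in scores_by_seq.items()}

# Hypothetical input: the same 36-mer measured three times.
example_36mer = "A" + "A" * 15 + "CCGC" + "T" * 15 + "T"
processed = preprocess([(example_36mer, 1.0, 2.0),
                        (example_36mer, 5.0, 4.0),
                        (example_36mer, 3.0, 0.0)])
```

Because the core sits in the middle of the 34-mer, taking the reverse complement leaves it at positions 16 to 19, so every processed sequence can be compared position by position.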
The median was used rather than the mean because the experimental data could have contained outliers, which would have skewed the mean. After the processing steps above, the data were filtered to satisfy certain conditions. First, the selected sequences had to have a GCGC or GCGG core, because E2F1 and E2F4 are known to bind well to sequences with those cores. In addition, only the sequences in which the absolute value of the difference in orientation scores was lower than a cutoff score were chosen. The cutoff score was found by testing several values and picking the one that produced the highest squared correlation coefficient (R²) values in the model. Then, the sequences that contained “GCGC” or “GCGG” in the farthest 11-mer flanks on either side were removed so that the TFs would bind at or near the core and not in the flanking regions; binding in the farthest flanks would have misrepresented the sequence with its log signal intensity score.

2.2 Partition and format the data

The processed PBM data for each TF were then randomly shuffled and partitioned into training sets (80% of data) and testing sets (20% of data). This process was repeated to produce ten different sets of training and testing data. All of the data were then converted into the LIBSVM format by transforming each sequence into both 1-mer and 3-mer features and ordering them in the way described by other papers [7]. However, because the data were subselected and tested separately based on the GCGC core and the GCGG core, the features in the core that were the same for every sequence, specifically positions 16-19 for 1-mers and 16-17 for 3-mers, were ignored to prevent those features from being overrepresented in the models. This process resulted in features 1-120 for the 1-mer features (30 non-core positions × 4 nucleotides) and features 121-2040 for the 3-mer features (30 retained start positions × 64 trinucleotides). Because these ten datasets were later subselected based on cores GCGC and GCGG, this resulted in ten smaller datasets each for TF E2F1 Core GCGC, TF E2F1 Core GCGG, TF E2F4 Core GCGC, and TF E2F4 Core GCGG, or a total of forty datasets.

2.3 Training, grid search, cross validation, and prediction

The support vector regression training model took two parameters: the cost variable (c) to penalize deviations from the model and the epsilon variable (p) to set the accepted and unpenalized deviation of each vector from the model.
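As an illustration of the feature layout described in Section 2.2, the sketch below numbers the 1-mer and 3-mer features of a 34-mer and writes one example in the sparse `label index:value` text format that LIBSVM reads. The ordering of features within each group (by position, then alphabetically by k-mer) is an assumption made here, since the text defers to [7] for the exact ordering, and all function names are hypothetical.

```python
from itertools import product

SEQ_LEN = 34
CORE_1MER_POSITIONS = {16, 17, 18, 19}   # 1-based positions skipped for 1-mers
CORE_3MER_POSITIONS = {16, 17}           # 3-mers lying entirely inside the core

def build_feature_index():
    """Map (k, position, k-mer) to a 1-based feature number."""
    index = {}
    for k, skipped in ((1, CORE_1MER_POSITIONS), (3, CORE_3MER_POSITIONS)):
        kmers = ["".join(p) for p in product("ACGT", repeat=k)]
        for pos in range(1, SEQ_LEN - k + 2):
            if pos in skipped:
                continue
            for kmer in kmers:
                index[(k, pos, kmer)] = len(index) + 1
    return index

FEATURE_INDEX = build_feature_index()

def encode(seq):
    """Return the sorted feature numbers that are present (value 1) in a 34-mer."""
    present = []
    for k in (1, 3):
        for pos in range(1, SEQ_LEN - k + 2):
            key = (k, pos, seq[pos - 1:pos - 1 + k])
            if key in FEATURE_INDEX:
                present.append(FEATURE_INDEX[key])
    return sorted(present)

def to_libsvm_line(score, seq):
    """Format one example as 'label index:value ...' with ascending indices."""
    return f"{score} " + " ".join(f"{i}:1" for i in encode(seq))
```

With this layout the index holds 2040 features in total, 1-mers occupying 1 to 120 and 3-mers 121 to 2040, and each 34-mer activates exactly 60 of them (30 of each kind).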
The optimal parameter settings (c, p) were unique for each dataset, and so the best pairing had to be found using a grid search. A coarse grid search was first done by testing every combination of c values 2⁻⁹, 2⁻⁸, 2⁻⁷, ..., 2⁻² and p values 2⁻⁷, 2⁻⁶, ..., 2⁻¹, 1, and then a fine grid search was done by zooming in on the area with the best R² values and repeating the grid search with new c and p values. For each (c, p) pairing, the R² value was found by 5-fold cross validation on the training set. After both the coarse and fine grid searches, the parameter pairing that produced the highest R² value, and thus the closest prediction of the TF binding specificity, was used to train a model on the entire training set, resulting in the final prediction model. This model was then tested for accuracy by predicting the binding specificities of the testing data and comparing the results with the experimental log signal intensity scores of the testing data, thus producing an R² value that represented the entire model’s accuracy.

2.4 Finding feature weights

The model could then be interpreted to find patterns in the sequences each TF bound. To do this, the feature weights, which show the relative importance of each feature in the model, were extracted using the matrix equation

$$w_l = \sum_{k=1}^{s} y_k \, v_{k,l}$$

where $y_k$ was the SVR model weight of the kth sequence (k = 1 to s, with s = the number of support vectors in the model), $v_{k,l}$ was a binary value of 0 or 1 showing whether the lth feature was present in the kth sequence, and $w_l$ was the outputted weight of the lth feature. Each of the four categories tested for TFs E2F1 and E2F4 and their cores GCGC and GCGG contained ten unique datasets and thus ten models, each with its own feature weights. So, the feature weights across all ten models were averaged for each category, and the standard deviation of each feature was calculated.
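A minimal sketch of this weight extraction and averaging follows. It assumes each support vector is given as a coefficient together with the set of features it contains, and it uses the sample standard deviation, which the text does not specify; the function names and the toy numbers are hypothetical.

```python
from statistics import mean, stdev

def feature_weights(sv_coefs, sv_features, n_features):
    """Compute w_l as the sum over support vectors k of y_k * v_{k,l}.

    sv_coefs:    y_k, one model coefficient per support vector
    sv_features: for each support vector, the set of 1-based features
                 that are present (v_{k,l} = 1)
    """
    weights = [0.0] * n_features
    for y_k, present in zip(sv_coefs, sv_features):
        for l in present:
            weights[l - 1] += y_k
    return weights

def summarize(models_weights):
    """Average each feature weight across models and give its standard deviation."""
    per_feature = list(zip(*models_weights))
    return [mean(ws) for ws in per_feature], [stdev(ws) for ws in per_feature]

# Toy example with 3 features and two hypothetical models.
model_1 = feature_weights([0.5, -0.25], [{1, 3}, {3}], n_features=3)
model_2 = feature_weights([0.75, -0.25], [{1, 3}, {2, 3}], n_features=3)
averages, deviations = summarize([model_1, model_2])
```

Because the features are binary, each support vector simply adds its coefficient to every feature it contains, which is why a feature that appears in few training sequences tends to end up with a weight near 0.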

Figure 3: The 3-mer feature weights for both cores of TF E2F1, plotted in a clustered column graph. Error bars show the standard deviation across all 10 iterations for each feature weight.

2.5 Graphs of feature weights for the E2F1 and E2F4 SVR models

The averaged feature weights for TF E2F1 Core GCGC, TF E2F1 Core GCGG, TF E2F4 Core GCGC, and TF E2F4 Core GCGG were each grouped by 1-mer and 3-mer features and plotted on clustered column graphs (Figures 3 & 4). The standard deviation for each feature weight was then shown through error bars. To find the most significant features for each model, a cutoff score was calculated by taking the highest feature weight and dividing it by 2. The features that had a weight above this cutoff were then selected and shown for both the 1-mer and 3-mer graphs (Figure 5).

2.6 Comparison graphs of feature weights between cores and between TFs

In addition to finding the significant features for each model, pairs of models were compared to each other to find the features with the greatest difference in weight between models. Using these graphs to compare E2F1 and E2F4 models, the most important differences in the sequences bound by these TFs could be identified. First, as the training sequences for each model were filtered with different cutoffs in orientation differences, each set of weights was normalized so that differing weight scales did not prevent an accurate comparison. All the weights were normalized such that each normalized weight lay in the interval -1 to 1 using the equation

$$n_k = \frac{w_k}{w_{\max}}$$

where $n_k$ represents the normalized feature weight of the unnormalized weight $w_k$ and $w_{\max}$ represents the maximum feature weight. This way of normalizing the feature weights for the E2F1 and E2F4 SVR models was used so that an unnormalized weight of 0 would remain 0 after normalization, which avoided giving a nonzero weight to the corresponding feature and thus a false impression that the feature had an impact on the model.

After the normalization, each feature was graphed using R on a scatter plot with one model on the x-axis and the other on the y-axis. The standard deviation for each point was then graphed in both the x and y directions (Figure 6). The very small error lines for each point show that the feature weights are stable across the ten models, supporting the reliability of the analyses based on them. Then, the line y = x was drawn as a basis for comparing the variations in the two models. If each feature weighed the same in both models, then each plotted point would lie on the y = x line. As the majority of the points did not lie on the line, this was not the case, and so the features with the greatest distance from the line, and thus with the greatest difference in weight, were found (Figure 7). After the side of the line on which the points were located was determined, the features could be grouped based on either core or TF.

Using this process, figures could be created for E2F1 Cores GCGC vs. GCGG (Figures 6A & 7A), E2F4 Cores GCGC vs. GCGG (Figures 6B & 7B), GCGC TFs E2F1 vs. E2F4 (Figures 6C & 7C), and GCGG TFs E2F1 vs. E2F4 (Figures 6D & 7D).

Figure 4: The 3-mer feature weights for both cores of TF E2F4, plotted in a clustered column graph. Error bars show the standard deviation across all 10 iterations for each feature weight.

Figure 5: The most significant 3-mer features from the models shown in Figures 3 & 4 are displayed. The core area, from position 16 to 19, is highlighted in yellow.

Figure 6: The feature weights were compared in a scatter plot to compare two different models. The line y = x was then plotted to highlight disparate feature weights between both models.

Figure 7: The most significant features for Figure 6 were taken based on distance from the y = x line. The name column displays the side of the line on which the feature was located. The core area is highlighted yellow.
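The selection and comparison steps of Sections 2.5 and 2.6 can be sketched as follows. This is an illustration under stated assumptions rather than the original R workflow: the maximum feature weight is taken as the largest weight in magnitude so that the normalized values stay within -1 to 1, distance is measured perpendicular to the line y = x, and all function names are hypothetical.

```python
import math

def significant_features(weights):
    """1-based features whose weight exceeds half of the highest weight."""
    cutoff = max(weights) / 2
    return [i + 1 for i, w in enumerate(weights) if w > cutoff]

def normalize(weights):
    """Scale weights into [-1, 1] while keeping a weight of 0 at 0."""
    # Assumption: w_max is the largest magnitude; needs one nonzero weight.
    w_max = max(abs(w) for w in weights)
    return [w / w_max for w in weights]

def distance_from_diagonal(x, y):
    """Perpendicular distance of the point (x, y) from y = x, and its side."""
    side = "above" if y > x else "below" if y < x else "on"
    return abs(y - x) / math.sqrt(2), side

def most_different(weights_a, weights_b, top=3):
    """Rank features by distance from y = x after normalizing both models."""
    points = zip(normalize(weights_a), normalize(weights_b))
    ranked = [(distance_from_diagonal(x, y), i + 1)
              for i, (x, y) in enumerate(points)]
    ranked.sort(key=lambda item: item[0][0], reverse=True)
    return [(feature, dist, side) for (dist, side), feature in ranked[:top]]
```

For example, `most_different(model_x_weights, model_y_weights)` returns the features that differ most between two models, with "above" meaning the feature weighs more in the model plotted on the y-axis.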

3. Results and Discussion

3.1 Significant features in the SVR models

During the protein-DNA binding process, the area of focus was primarily the core of the sequence and its closest surrounding nucleotides, and so the distant flanks had less of an influence on the TF binding specificity. The most significant features for all four models supported this, as they were on the six-base-long flanks on either side of the core, suggesting that these regions had the greatest influence on the binding preferences of TFs (Figure 5). In addition, because the sequences bound by E2F1 and E2F4 intersected to some extent (Figure 2B), the important features for these TFs were expected to overlap. The data supported this, as both E2F1 and E2F4 preferred sequences with A and T trinucleotides (Figure 5). However, E2F1 had a higher specificity for these trimers than E2F4, indicating that sequences with these trinucleotide features were expected to show greater binding potential to E2F1. In addition, the lower R² scores found for E2F4 models compared to E2F1 models indicated that E2F1’s binding specificities were better predicted by sequence information alone than those of E2F4. The reasons for this are still unclear but may be related to the presence of variables such as cofactors in the E2F4 binding process. Without these additional factors included in the models, the ability to predict E2F4’s bound sequences is diminished, so A and T trinucleotide patterns appear to have less of an effect.

3.2 Comparison of feature weights for GCGC versus GCGG

The deviation of the feature points from the y = x line in the graphs comparing cores (Figures 6A & 6B) can be explained by the nature of the nucleotide interactions. The different cores of the GCGC and GCGG sequences impacted the bonding structures of the flanks, and so the important features were slightly different for each core. If models had been constructed without this distinction of cores, the features containing only the core would not have been ignored in the model, resulting in overrepresented and disproportionately high weights for those features. Thus, the feature points for the graphs comparing cores were expected to correlate only weakly with the y = x line. Of the features that were farthest from this line, the majority contained nucleotides inside the core (Figures 7A & 7B). This suggested that the different cores themselves had the greatest effect on feature differences, most likely because of their high representation in the sequences, and that there was little difference in the actual flanking preferences between cores. In addition, the majority of the features used in the model had a weight of 0 because 3-mer features accounted for 4³ = 64 different combinations of nucleotides at every position, and so most features were expected to be represented only a few times in the training sequences. Therefore, the SVR model would not have had enough information to accurately place a weight on these features, resulting in a very low weight close to 0.

3.3 Comparison of feature weights for E2F1 versus E2F4

As with the graphs comparing cores, the graphs comparing TFs showed the majority of feature weights clustered around 0 (Figures 6C & 6D). However, the feature points seemed to lie closer to the y = x line, with fewer outliers compared to the graphs comparing cores.
In addition, the farthest outliers proved to be mostly A and T trinucleotides, which were more specific to E2F1 than to E2F4 (Figures 7C & 7D). This pattern was also more pronounced for sequences with core GCGC than for those with core GCGG. 3.4 Reliability and statistical errors To ensure the highest degree of accuracy in the feature weights, the models needed to accurately predict the log signal intensity scores for new testing sequences. When averaging the R2 values for all 10 models, we found very high R2 values of 0.895 for TF E2F1 Core GCGC, 0.810 for TF E2F1 Core GCGG, 0.799 for TF E2F4 Core GCGC, and 0.743 for TF E2F4 Core GCGG. So, the models were very reliable as they displayed high prediction accuracy, confirming the validity of the feature 42 | 2015-2016 | Volume 5

Biology Research weights. In addition, the feature weights were very stable as shown by the very small standard deviation error bars for each weight (Figures 3 & 4). In addition, the comparison graphs in Figure 6 show very small error bars on both the x-axis and y-axis, while the patterns that were observed from the graphâ&#x20AC;&#x2122;s features also corroborated the observations made from the significant features of the models, thus validating the results. This also enhanced the conclusions of studies in current literature on human TFs, which also demonstrated that TFs showed greater preference to sequences with A and T stretches [8]. As only 1-mer and 3-mer features were used in the model, the information gained from the sequences was slightly limited. Although trinucleotides were observed, it was hard to say whether they were part of larger homogenous motifs. Indeed, the significant features from the models showed that identical trinucleotides overlapped slightly in the flanks, pointing to potentially larger sequences of identical nucleotides. However, including k-mer features larger than the 3-mers would have required a much larger training set to accomodate the exponentially increased set of features. As the processing and filtering of the raw PBM data limited the accepted data size, this was not tested. 3.5 Implications of research The Human Genome Project (HGP) has mapped the entire human genome to extract the functions of DNA sequences within the human body. Given TFs with similar protein structures, the conclusions from this project will differentiate the sequences that each TF binds to. This is crucial as even TFs in the same family have slightly different functions in the body. Studies have shown that although both E2F1 and E2F4 play crucial roles in cell proliferation, E2F4, unlike E2F1, does not induce apoptosis and leads to tumors with decreased latency and increased frequency [11]. 
Despite this, much of the current literature groups these TFs together and fails to provide insight into their different binding preferences. By analyzing the features of genomes, the binding affinities of E2F1 and E2F4 to DNA sites can be determined, potentially leading to important predictions of a person's likelihood of growing benign or malignant tumors. Implications of this include sequencing a person's genome to reveal the probabilities of the person genetically acquiring certain diseases or cancers. The capability of detecting the specific TF rather than just the TF family from the genome would then lead to more accurate indications of each prognosis.
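The remark in Section 3.4 that k-mer features larger than 3-mers would enlarge the feature set exponentially can be illustrated with a short sketch. It assumes plain overlapping k-mer counts over the alphabet A, C, G, T; the function names are hypothetical, and the encoding actually used by the models may differ (for example, it may be position-specific).

```python
from itertools import product

ALPHABET = "ACGT"  # assumed DNA alphabet


def kmer_vocabulary(k):
    """Return every possible DNA k-mer of length k (4**k strings)."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def count_kmers(sequence, k):
    """Count overlapping k-mers in a sequence; unseen k-mers stay at 0."""
    counts = {kmer: 0 for kmer in kmer_vocabulary(k)}
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows with characters outside ACGT
            counts[window] += 1
    return counts


# The vocabulary grows as 4**k, so each extra base multiplies
# the number of candidate features by four.
for k in (1, 3, 4, 6):
    print(k, len(kmer_vocabulary(k)))
```

Moving from 3-mers to 6-mers raises the number of candidate k-mers from 64 to 4096, which is why a larger training set would be needed to fit the additional weights.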

4. Conclusion and Future Work

Although many studies aim to find the binding preferences of whole families of TFs, there is a lack of data on individual TFs within a particular family. This project found that TF E2F1 had a greater preference for sequences with A and T stretches in the flanking regions than TF E2F4. The flanks that were six bases on either side of the core especially influenced the outcome of binding for these two TFs. In addition, E2F4 was shown to be less predictable than E2F1 due to variables such as cofactors. As the significant features for each model and those for the comparison graphs were analyzed in conjunction, the conclusions made were sound and supported by multiple sources. The standard deviations were low and the reliability high, thus supporting the validity of the data. Many of the conclusions made for this project applied to TFs in different families from different studies as well.

Further studies should look at additional methods of comparison for the data to either find new observations or corroborate established ones. In addition, other TFs in the E2F family such as TF E2F2 or E2F3 should be added to the study in order to determine if the observations made extend to other proteins in the same family. The research should also broaden its scope to different families of TFs to discover more general conclusions. Additional areas for research include why these TFs prefer A and T stretches in the flanks and how to better distinguish which sequences bind to which TFs in the E2F family without the use of the models. One change in the methods would be to normalize the log signal intensity scores to allow a better comparison of the data.

Thus, by creating a computational program to compare the feature weights for TFs E2F1 and E2F4, the most significant variations in the bound sequences were found. Finding patterns in these feature differences then leads to critical conclusions as to why TFs sharing very similar structural domains fail to show analogous binding preferences to the same genomic DNA sequences.


5. Acknowledgements

The Gordan lab at Duke University provided me with the experimental data and a computer to run the machine learning program. My mentor, Dr. Raluca Gordan, offered weekly suggestions concerning my project and suggested useful reading materials.

6. References

[1] Science 2.0. Three waves of innovation in vertebrate evolution, 2011.
[2] Michael F Berger and Martha L Bulyk. Universal protein-binding microarrays for the comprehensive characterization of the DNA-binding specificities of transcription factors. Nature protocols, 4(3):393–411, January 2009.
[3] Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin. A Practical Guide to Support Vector Classification. BJU international, 101(1):1396–400, 2008.
[4] BIOBASE Biological Databases. Biobase transfac professional.
[5] ENCODE. Encode: Encyclopedia of DNA elements.
[6] Chih-Chung Chang et al. Libsvm – a library for support vector machines.
[7] Gordân Raluca et al. Genomic regions flanking E-box binding sites influence DNA binding specificity of bHLH transcription factors through DNA shape. Cell reports, 3(4):1093–104, April 2013.
[8] Jolma et al. DNA-binding specificities of human transcription factors. Cell, 152(1-2):327–339, 2013.
[9] GeneCards. E2f transcription factor 1, 2008.
[10] Alex J Smola and Bernhard Schölkopf. A Tutorial on Support Vector Regression. Statistics and Computing, 14(3):199–222, 2004.
[11] D Wang, J L Russell, and D G Johnson. E2F4 and E2F1 have similar proliferative properties but different apoptotic and oncogenic properties in vivo. Molecular and cellular biology, 20(10):3417–3424, 2000.


Physics and Engineering Research


Gravity Wave Disturbances in the F-region Ionosphere Above Large Earthquakes
Margie Bruff


We studied the direction of propagation, duration and wavelength of atmospheric gravity wave (AGW) disturbances in the ionosphere above large earthquakes using data from the Super Dual Auroral Radar Network. We plotted ground scatter power against range and time to identify AGWs as alternating focused and de-focused regions of radar power in wave-like patterns. We analyzed the wave patterns in the weeks around the times of major earthquakes to determine the directions of propagation and wavelengths. To exclude waves caused by geomagnetic activity, we considered conditions 48 hours before and after each identified disturbance. We found non-geomagnetic disturbances preceding all six earthquakes for which data were available and succeeding four of the six earthquakes. AGWs travelled in at least two directions away from the epicenter in all cases, and stronger patterns were found for two earthquakes. On average, AGWs appeared 4 days before earthquakes, persisting 2-3 hours, and 1-2 days after, persisting 4-6 hours. Most wavelengths were between 200 and 300 km. We show a possible correlation between magnitude and depth of earthquakes and AGW patterns, but further study is required. Our results provide a better understanding of the dynamics of propagation of AGWs and have potential applications for predicting earthquakes.

1. Introduction

1.1 Gravity Waves

The goal of this study was to identify patterns in the direction of propagation, duration and wavelength of gravity wave disturbances in the ionosphere near large earthquakes. Gravity waves are waves in a fluid for which the restoring force is gravity or buoyancy. When gravity waves occur adjacent to a boundary, energy is transferred perpendicular to the direction of propagation such that the waves are sustained beyond the boundary [1]. This effect is shown in figure 1, where the boundary is indicated by a dotted red line and wave fronts are indicated by the “L” (low, red) and “H” (high, blue) labels.

Figure 1. Gravity wave [1]

1.2 Sources of AGWs

The gravity waves considered in this study are atmospheric gravity waves (AGWs). AGWs are waves in the neutral atmosphere initiated by the localized vertical displacement of a region of air. AGWs are most commonly caused by severe weather systems, winds over mountainous terrain, and geomagnetic sources [2].

In regions with mountainous terrain, wind traveling over mountains can be forced upward. Buoyancy acts as a restoring force to cause disturbed air to oscillate up and down. The wave energy is transferred both horizontally in the direction of propagation and vertically across atmospheric boundaries (as shown in figure 1) associated with rapid changes in density, such as the tropopause, mesopause and stratopause. Due to the overall trend for density to decrease with altitude, the wave increases in amplitude as energy is transferred vertically. Once the AGW reaches nearly the altitude of the ionosphere (about 60 km above sea level), the oscillating neutral gas particles collide with plasma in the ionosphere and cause a chain of waves called a traveling ionospheric gravity wave disturbance [1].

AGWs can also be caused by tropospheric phenomena in severe weather systems, including many different circumstances where stable, cold air is displaced. Since the cool air is denser than surrounding warm air, buoyancy acts as a restoring force to set the air into vertical oscillation. Similar to the topographical effects described above, the waves will be sustained across boundary layers of the atmosphere and collide with plasma in the ionosphere to result in traveling ionospheric gravity wave disturbances. Low pressure systems are the most common tropospheric phenomena that allow this to happen [2]. Figure 2 shows an air jet stream forced around a low pressure system. The displaced jet of air, shown by the light blue arrow in the figure, interacts with the system’s fronts and is set into oscillation.

Charged particles from the solar wind can reach Earth’s magnetic field and enter the lower ionosphere through field-aligned currents called the auroral electrojet [3]. Figure 3 shows the electrojet over the polar and auroral regions during geomagnetic storm conditions. The color scale shows current density in microamps per square meter. High levels of solar activity can cause non-continuous bursts of plasma to escape the electrojet.
These bursts cause AGWs by Joule heating, which occurs when collisions of charged particles with the medium result in an energy transfer that causes heating in the local region of the medium. When a strong burst of plasma enters the ionosphere, the effect of Joule heating is strengthened and the local medium is disturbed, resulting in AGWs. These AGWs differ from those caused by topographic mechanisms in that they are single impulse waveforms rather than long chains of waves. The propagation patterns of these waves in the atmosphere are affected by wind speed and direction as well as the intensity and direction of the auroral electrojet [3].

Figure 2. Low pressure system as a tropospheric source of gravity waves; 500 millibar isobars are shown in black and the 300 millibar jet stream in light blue [2]

Figure 3. Auroral electrojet during geomagnetic storm conditions [4]

1.3 Previous Studies of AGWs from Earthquakes

Li et al. used GPS to study the total electron content (TEC) in the ionosphere before the May 12, 2008 Wenchuan earthquake. Since regions of high electron density deflect GPS signals, TEC can be derived from the difference between expected and observed signal delay. The study showed that TEC decreased to the east of the epicenter and increased to the southeast three days before the earthquake for a two-hour period (as shown in figure 4) [5].

Figure 4. TEC from May 9, 2008 over the epicenter (black dot in each frame) of the Wenchuan 2008 earthquake measured by GPS stations (grey dots) [5]

The study cited the theory by Hegai et al. that AGWs could occur before strong earthquakes because of the Lithosphere-Atmosphere-Ionosphere (LAI) coupling mechanism. The mechanism explains that the electromagnetic radiation emitted before and during seismic activity produces an electric current system that inverts the local electric field. This creates a region of positive electric potential at the top of the atmosphere. Variations in total electron content (TEC) occur when the particles released from the lithosphere are trapped in this positive well [6].

Similarly, a GPS TEC study of the Tohoku 2011 earthquake by Galvan et al. found that AGWs propagated outward from the epicenter with approximately the same speed and wavelength as the tsunami waves on the ocean's surface. The study concluded that the surface acoustic waves generated by the earthquake caused initial, short duration AGWs, while the tsunami caused AGWs that took longer to reach the ionosphere [7].

1.4 Radar Observation of Gravity Waves

Both studies described above utilized GPS TEC data to observe ionospheric activity. This method, however, is limited by the location of GPS stations and the movement of satellites and has large sources of error when the signal travels through regions of varying atmospheric conditions. Stationary ground-based radar can more easily and accurately detect patterns of wave disturbances in the ionosphere. The Super Dual Auroral Radar Network (SuperDARN) is a network of ground-based radars that uses high frequency (9-18 MHz) radar signals to study plasma irregularities in the F-region (the uppermost region) of the ionosphere. The network consists of 34 active radars at high latitudes operated by 13 different universities in 11 countries.

Gravity wave disturbances can be observed by a method described by Frissell et al. Radar signals sent into the ionosphere are reflected to the ground in alternating regions of focused and de-focused power. The signals are focused by the concave regions of wave disturbances and de-focused by the convex regions (figure 5) [8]. These signals are then scattered from the ground back into the ionosphere, where they are reflected back to the radar. The signal that returns to the radar after this process is called ground scatter. Thus, AGWs can be identified by the wavelike pattern shown in figure 6 on a range-time-intensity (RTI) plot of ground scatter power [9]. This plot shows a spectrum of power versus time (x-axis) and distance from the radar (y-axis).
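To make the layout of an RTI plot concrete, the following minimal sketch builds a synthetic grid of ground scatter power modulated by a single traveling wave. It is an illustration only, not SuperDARN data or DaViTpy code: the function name, the 45 km range step, the 1-minute time step and the power levels are all assumed values, while the 250 km wavelength and 30-minute period are taken from the ranges discussed in this article.

```python
import numpy as np


def synthetic_rti(n_times=120, n_ranges=60, dt_min=1.0, dr_km=45.0,
                  wavelength_km=250.0, period_min=30.0,
                  base_db=20.0, amp_db=5.0):
    """Return (times, ranges, power) for one traveling wave.

    power has shape (n_ranges, n_times): rows are distance from the
    radar (y-axis) and columns are time (x-axis), as on an RTI plot.
    """
    times = np.arange(n_times) * dt_min    # minutes
    ranges = np.arange(n_ranges) * dr_km   # km from the radar
    tt, rr = np.meshgrid(times, ranges)    # both shaped (n_ranges, n_times)
    # Alternating focused (high) and de-focused (low) power bands
    phase = 2.0 * np.pi * (rr / wavelength_km - tt / period_min)
    power = base_db + amp_db * np.cos(phase)
    return times, ranges, power


times, ranges, power = synthetic_rti()
print(power.shape)
```

Plotting `power` with time on the x-axis and range on the y-axis gives slanted bands of high and low power, which is the wavelike signature that the method looks for in real ground scatter.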

Figure 5: Focusing effect of a wave disturbance [9]

Figure 6: Power RTI plot for gravity wave disturbance observed by the Goose Bay Radar [9]

1.5 Hypotheses and Motivation of Current Study

Based on previous studies by Li et al. [5] using GPS TEC observations as well as the model developed by Hegai et al. [6], we expected that gravity wave disturbances would travel radially away from the epicenter both before and after a large earthquake. Wave disturbances would have a period around half an hour and would appear 3-5 days before the earthquake. Those from larger magnitude earthquakes would appear sooner than those from smaller magnitude earthquakes, because stress would have built up longer and thus the vertical electric field proposed by Hegai et al. [6] would likely have a greater effect sooner. Similarly, if an earthquake has a larger magnitude, gravity wave disturbances caused by the earthquake would last longer after the earthquake (2-48 hours), because more stress buildup will lead to more Joule heating and a larger amplitude of gravity waves that will, with the same atmospheric damping effect, take longer to return to equilibrium.

This research is significant toward gaining a better understanding of the interaction between the lithosphere, atmosphere and ionosphere. Knowing how gravity wave disturbances propagate from a point source, such as an earthquake, could greatly expand our understanding of the LAI coupling model. Furthermore, observing wave patterns before an earthquake could increase our understanding of how gravity waves are created by heating and could eventually lead to the ability to predict earthquakes by monitoring gravity wave disturbances in the ionosphere.
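The reasoning behind the duration hypothesis, that a larger initial amplitude takes longer to die away under the same damping, can be sketched numerically. The exponential decay law, the detection threshold and the damping rate below are all assumptions made for illustration; the study does not specify a damping model.

```python
import math


def time_to_threshold(a0, a_threshold, damping_rate):
    """Time for A(t) = a0 * exp(-damping_rate * t) to fall to a_threshold.

    Assumes simple exponential damping (hypothetical model); the units
    of time are the inverse of the units of damping_rate.
    """
    if a0 <= a_threshold:
        return 0.0  # already at or below the threshold
    return math.log(a0 / a_threshold) / damping_rate


# Same damping rate, two different starting amplitudes (arbitrary units)
small = time_to_threshold(2.0, 1.0, 0.1)
large = time_to_threshold(4.0, 1.0, 0.1)
print(small, large)
```

Under this assumed model, doubling the starting amplitude adds a fixed extra time of ln(2) divided by the damping rate, so the larger disturbance remains detectable for longer.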

2. Materials and Methods

The USGS Earthquake Archive Search [10] and SuperDARN Radar Finder and Data Inventory [11,12] were used to identify earthquakes that fit the following criteria: local Richter magnitude greater than or equal to 6, epicenter located in the Northern hemisphere, in or near the field of view (FoV) of one or more SuperDARN radars, and time of occurrence during the operation of the radars. Since fitex data, that is, the ground scatter data fitted to obtain power, line of sight velocity, and spectral width, are only accessible from 2006-2015, only earthquakes during this range of years were considered. Data were accessed through the SuperDARN Data Visualization Python Toolkit, or DaViTpy [13], a database and plotting system designed to facilitate easier access and visualization of SuperDARN and other space physics data.

All steps of data collection and analysis through wave number plots were conducted within a version of the DaViTpy Multiple Signal Classification (MUSIC) module revised to include three radar beams in initial plotting and enhanced filtering. The module was first used to locate gravity waves within 14 days before and 7 days after each earthquake. Basic, unfiltered range-time-intensity (RTI) plots were created for every six-hour interval over the entire time span for each active radar nearby. A wave number plot, an example of which is shown in figure 7, was then created for each active radar near the epicenter for each time interval with a gravity wave pattern. Wave number is a measure of the number of wave cycles in one meter. The plots show the spectrum of wave intensity (color scale) over wave number in the y-direction (North/South) and wave number in the x-direction (East/West). This is calculated by the module for the two-hour period with the strongest wave pattern using Fast Fourier Transforms (FFTs) and slope comparison from three beam directions of the filtered RTI plot.
The resulting plot shows the direction of propagation of the gravity waves by the position of the power peak in the x-y plane, oriented like a compass. For example, a peak in the first quadrant shows a wave propagating Northeast, whereas a peak in the third quadrant shows a wave propagating Southwest. The plot description prints the wavelength, intensity and azimuth of the propagation angle.

To get an idea of the overall wave patterns around each epicenter, the results of the wave number plots were combined onto a map generated by the SuperDARN Radar Coverage Tool [14]. Wavelengths greater than 750 km correspond to a very small wave number and were thus excluded. The mapped patterns, recorded wavelengths and initial durations were compared for gravity waves before and after six earthquakes.
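The way a peak in the wave number plane maps to a propagation direction and wavelength can be sketched as follows. This is not the DaViTpy MUSIC code: the function name is hypothetical, and the sketch assumes wave numbers expressed in cycles per kilometre (so that wavelength is the reciprocal of the wave number magnitude), with x pointing East and y pointing North. If a tool reports wave numbers in radians per kilometre, the wavelength would instead be 2π divided by the magnitude.

```python
import math

COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
MAX_WAVELENGTH_KM = 750.0  # longer wavelengths were excluded in this study


def describe_peak(kx, ky):
    """Convert a spectral peak (kx East, ky North, assumed cycles/km)
    into (wavelength_km, azimuth_deg, compass_sector).

    Returns None for a zero wave number or a wavelength above 750 km.
    """
    k = math.hypot(kx, ky)
    if k == 0.0:
        return None
    wavelength_km = 1.0 / k
    if wavelength_km > MAX_WAVELENGTH_KM:
        return None
    # Azimuth measured clockwise from North, like a compass bearing
    azimuth_deg = math.degrees(math.atan2(kx, ky)) % 360.0
    sector = COMPASS[int((azimuth_deg + 22.5) // 45) % 8]
    return wavelength_km, azimuth_deg, sector


print(describe_peak(0.003, 0.003))    # first quadrant peak
print(describe_peak(-0.003, -0.003))  # third quadrant peak
print(describe_peak(0.001, 0.0))      # 1000 km wavelength, excluded
```

A first-quadrant peak is labelled NE and a third-quadrant peak SW, matching the compass reading of the plot described above.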

Figure 7: Example wave number plot from Fort Hays West (fhw) radar with three waveforms numbered from highest to lowest intensity (power) (created with DaViTpy MUSIC)

Figure 8: Adak West (adw) wave number plot for 5/18/15 21:00-23:00

3. Results

The results from the wave number plots were combined onto maps (figures 10-15, maps from: Radar Coverage Tool, VT SuperDARN). The maps use magnetic coordinates and show the FoVs of the radars used (shaded orange for mid-latitude, blue for high-latitude), with the center of each FoV marked by a yellow dot. The epicenter of each earthquake is marked with a star, and the direction of propagation of each wave is given by a black arrow. The placement and direction of the arrows were determined by the azimuth and were drawn relative to the center of the radar field of view.

As an example, the wave number plots before the May 29, 2015 earthquake are shown in figures 8-9. The Adak West (adw) radar plot (figure 8) shows Southeast and Northwest traveling waves relative to the center of the radar FoV, and the Adak East (ade) plot (figure 9) shows Northeast and Southwest traveling waves. These results were combined to create the first map in figure 15.

Frissell et al. [8] found that most gravity wave disturbances caused by geomagnetic activity travel Southeastward in the Northern hemisphere. Thus, for all recorded waves with propagation direction near Southeast, wave conditions were considered 48 hours before and after the plot time. If Southeast waves were persistent within the time interval, they were attributed to geomagnetic sources rather than the earthquake and excluded from mapping and further analysis.

Wavelengths varied dramatically from 150 to 750 km, but most were between 200 and 300 km. This is consistent with those found by Frissell et al. [8] for thermospheric (upper atmosphere) point-source waves, which are known to peak at 250 km. These results rule out tropospheric storms and geographic features as potential sources since they produce wave chains [8] rather than point-source waves.

Figure 9. Adak East (ade) wave number plot for 5/18/15 23:00- 5/19/15 1:00
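The screening rule for geomagnetic waves described in the Results (check 48 hours on either side of any Southeast-traveling wave and discard it if Southeast waves persist) could be expressed roughly as below. The tolerance around Southeast and the number of other Southeast observations that counts as "persistent" are hypothetical parameters chosen for illustration; the study applied this judgment by inspecting the plots rather than by a fixed threshold.

```python
from datetime import datetime, timedelta


def is_near_southeast(azimuth_deg, tolerance_deg=30.0):
    """True if the azimuth is within tolerance of 135 degrees (Southeast).

    tolerance_deg is an assumed value, not one stated in the study.
    """
    diff = (azimuth_deg - 135.0 + 180.0) % 360.0 - 180.0
    return abs(diff) <= tolerance_deg


def attributed_to_geomagnetic(candidate, observations,
                              window_hours=48, min_other_se=2):
    """Decide whether a wave should be excluded as geomagnetic.

    candidate and each observation are (datetime, azimuth_deg) tuples.
    min_other_se is a hypothetical stand-in for "persistent".
    """
    t0, az0 = candidate
    if not is_near_southeast(az0):
        return False  # only Southeast waves are screened
    window = timedelta(hours=window_hours)
    others = [
        (t, az) for (t, az) in observations
        if t != t0 and abs(t - t0) <= window and is_near_southeast(az)
    ]
    return len(others) >= min_other_se


t0 = datetime(2015, 5, 18, 21, 0)
nearby = [(t0 - timedelta(hours=10), 140.0), (t0 + timedelta(hours=30), 130.0)]
print(attributed_to_geomagnetic((t0, 138.0), nearby))  # persistent SE waves
print(attributed_to_geomagnetic((t0, 45.0), nearby))   # NE wave is kept
```

Waves flagged by such a rule would be left off the maps, so that only disturbances not explained by geomagnetic activity are compared with the earthquake times.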

Figures 10-15. Maps of AGWs before (left) and after (right) each earthquake

Excluding those attributed to geomagnetic sources, AGWs were found traveling in at least two directions away from the epicenter before all six earthquakes. More radial patterns were found before and after the January 5, 2013 earthquake and the September 25, 2014 earthquake. Except for those before the May 29, 2015 earthquake, which began 11 days before, disturbances before earthquakes were observed 3-5 days in advance and lasted between 2 and 3 hours. For the earthquakes that were followed by wave patterns, disturbances were observed 1-3 days after and lasted between 2 and 6 hours.

4. Discussion

The direction of propagation, duration and wavelength of observed waves were the main target variables of this study. The January 5, 2013 and September 25, 2014 earthquakes had the strongest radial wave patterns found both before and after. The January 5, 2013 earthquake had one of the largest magnitudes (7.5), and the September 25, 2014 earthquake had the largest depth (108.9 km). However, more data were available for both these earthquakes, and thus further study would be required to determine whether the stronger patterns were an effect of the magnitude, depth or amount of data.

Observation of waves occurring 3-5 days in advance and lasting between 2 and 3 hours is consistent with our hypotheses and the results by Li et al. [5]. Most waves succeeding earthquakes occurred 1-2 days after, but an additional wave disturbance was observed 3 days after the September 25, 2014 earthquake. This is likely due to the epicenter’s location near water and the effect of tsunami waves, as predicted by Galvan et al. [7]. There is also a wider variability in durations after earthquakes (ranging from 2 to 6 hours). This variability suggests that other factors, such as depth, magnitude and location on land or in water, have a greater effect on ionosphere interaction after earthquakes than before.

5. Conclusion

These observations consistently show strong patterns of gravity waves before earthquakes, and show that the waves before earthquakes were more consistent with our hypotheses than those after. Additionally, observations show a possible correlation between magnitude and depth of earthquakes and the duration of gravity wave disturbances. Determining this relationship would dramatically improve our understanding of the electric mechanism responsible for initiating waves.

The next steps for this research include a study of the waves associated with earthquakes in the Southern hemisphere. Twelve earthquakes with a Richter magnitude greater than 6 occurred between 2006 and 2015 near the field of view of the Unwin and Tiger (unw, tig) radars in the Southern hemisphere. Data for these earthquakes are listed as available in the SuperDARN Data Inventory but are inaccessible through the plotting system used in this study. These earthquakes include a wider range of magnitudes (6.6-8.1) than those in this study and thus could show a stronger relationship between magnitude and AGW properties. From here, models could be generated to improve the LAI coupling model and work toward predicting large magnitude earthquake occurrences.

6. Acknowledgements

This research was conducted at the North Carolina School of Science and Mathematics (NCSSM). I thank Dr. Jonathan Bennett of NCSSM, Dr. Jef Spaleta of the University of Alaska Fairbanks, and Nathaniel Frissell of Virginia Tech for their mentorship and guidance. I also thank Bill Meek for the Linux laptop on which the Python toolkit ran and the other students in the NCSSM Research in Physics program for guidance and support.

7. References

[1] Hocking, W. K. (2001). Buoyancy (gravity) waves in the atmosphere. Retrieved June 15, 2015, from grav_wav.html.
[2] Wang, S., & Zhang, F. (2011). Gravity Waves from Midlatitude Weather Systems. Retrieved Sept. 12, 2015, from workshops/11_02_Chapman_Conference/Oral/Day5.A.Wang.pdf.
[3] Huang, C. S., Andre, D. A., & Sofko, G. J. (1998). Observations of solar wind directly driven auroral electrojets and gravity waves. Journal of Geophysical Research [Online]. Retrieved Sept. 12, 2015.
[4] Weimer. Substorm Convection Patterns. University of California at Los Angeles. Retrieved September 15, 2015 from gem/poster/weimer/substorm.
[5] Li, J., Meng, J., You, X., Zhang, R., Shi, H., & Han, Y. (2015). Ionospheric total electron content disturbance associated with May 12, 2008, Wenchuan earthquake. Geodesy and Geodynamics. June 13, 2015.
[6] Hegai, V., Kim, V., & Nikiforova, L. (1997). A possible generation mechanism of acoustic-gravity waves in the ionosphere before strong earthquakes. Earthquake Pred. Res. 6 (4), 584–589. Retrieved July 15, 2015.
[7] Galvan, D. A., Komjathy, A., Hickey, M. P., Stephens, P., & Snively, J. (2012). Radio Science vol. 47 (4). Retrieved June 20, 2015.
[8] Frissell, N. A., Baker, J. H., Ruohoniemi, J. M., Gerrard, A. J., Miller, E. S., Marini, J. P., West, M. L., & Bristow, W. A. (2014). Climatology of medium-scale traveling ionospheric disturbances observed by the midlatitude Blackstone SuperDARN radar. J. Geophys. Res. Space Physics, 119, 7679–7697. Retrieved June 10, 2015.
[9] Ruohoniemi, J., Baker, J., Frissell, N. A., deLarquier, S., & Thomas, E. (2012). Remote sensing of the ionosphere and Earth’s surface with HF radar. Virginia Tech. Retrieved June 19, 2015.
[10] Earthquake Archive Search. (n.d.). US Geological Survey. Retrieved June 15, 2015, from http://earthquake.
[11] SuperDARN Radar Finder. (n.d.). Virginia Tech. Retrieved June 17, 2015, from tiki-index.php?page=radarFinder.
[12] SuperDARN Data Inventory. (n.d.). Virginia Tech. Retrieved June 17, 2015 from tiki-index.php?page=Data+Inventory.
[13] Ribeiro, A., deLarquier, S., Frissell, N. A., Spaleta, J., Reddy, B., & Stern, K. SuperDARN DaViTpy. (2012). [Computer software]. Virginia Tech. Retrieved June 20, 2015, from
[14] SuperDARN Radar Coverage Tool. (n.d.). Virginia Tech. Retrieved June 17, 2015, from http://vt.superdarn.org/tiki-index.php?page=radarFoV.


Combination of Microneedles and Ultrasound for the Transdermal Treatment of Melanoma
Sophia Hu

ABSTRACT

Transdermal chemotherapeutic drug delivery is a promising and attractive method for skin cancer treatment. The current preferred method for treating skin cancers is surgery, but surgical methods are less effective with more malignant or metastasized skin cancers such as melanoma. Other treatments like oral chemotherapy are often systemic and incite many negative side effects, such as weight loss, nausea, and vomiting. Additionally, more convenient drug delivery methods such as transdermal drug delivery have been tried in the past, but problems such as low drug permeability still pose persistent challenges. In this study, an ultrasound-triggered drug delivery system was used to release an anticancer drug, doxorubicin, from microparticles loaded within microneedles as a unique transdermal method of treating skin cancers. Doxorubicin release was proportional to the length of the ultrasound exposure applied. More importantly, ultrasound application enhanced controlled doxorubicin release from microneedles. Cell viability assays using trial sample solutions found greater cell death rates from longer ultrasound-exposed trials, confirming the efficacy of the ultrasound application in promoting doxorubicin release. Taken together, these findings indicate that drug delivery through combined use of ultrasound and microneedles is a promising approach towards conveniently treating malignant skin cancers while reducing systemic side effects.

1. Introduction

Skin cancer is one of the most common cancers in the United States [1]. Malignant melanoma has the lowest survival rate of all skin cancers, with a 16% 5-year survival rate when discovered in its distant and most metastasized state [2]. Currently, the most common and preferred way of treating melanoma is surgical excision. The surgical process involves cutting out the melanoma, along with a margin of healthy tissue to account for any possible metastasis of tumor cells. In patients with more advanced melanoma (stages III and IV), surgery is followed by the use of chemotherapy and/or radiation [3]. The only FDA-approved chemotherapeutic drug for skin cancer is dacarbazine (DTIC), although other drugs commonly examined in studies include doxorubicin, temozolomide, and paclitaxel through IV-infusion [4].

These current treatments pose multiple problems for patients. First, surgery for advanced melanoma can miss metastasized cells, which can allow for continued tumor growth [4]. Many of these chemotherapeutic drugs must be ingested orally. Therefore, they must pass through the digestive process and are more systemic, increasing the amounts of drugs needed. This causes many toxic side effects, such as vomiting, hair loss, anemia, loss of appetite, and fatigue [5]. Also, the traditional approach of injecting drugs using hypodermic needles is painful and can cause unnecessary infections, making it inconvenient for the patient [6]. Furthermore, surgery is generally inconvenient and expensive for patients and carries the risk of infection.

An alternative and more promising approach for advanced melanoma treatment is transdermal drug delivery (TDD). Transdermal delivery can release chemotherapeutic drugs in lower and thus less systemic quantities through the skin. One form of transdermal drug delivery uses microneedle (MN) arrays. These arrays, or patches, are generally 300-1500 μm long. MNs are an attractive approach towards drug delivery because they bypass the stratum corneum, allowing them to deliver higher-molecular weight drugs as compared to traditional transdermal delivery methods. Furthermore, they do not cause bleeding or pain [7] and reduce the risk of needle-puncture infections [8]. Dissolvable MNs in particular ensure biodegradability and biocompatibility [9].

Ultrasound has been shown to facilitate controlled drug delivery from drug-encapsulated particles on demand. For example, researchers were able to use an ultrasound application to trigger the release of glucose that had been loaded into particles from glucose-responsive nano-networks in mice [10]. This ultrasound treatment is an extrinsic form of stimulation that carries great potential in releasing drugs such as doxorubicin from ultrasound-responsive particles loaded within MNs [11, 7]. However, to the best of our knowledge, no study in the existing literature has examined the efficacy of the integrated use of microneedles and ultrasound for the treatment of skin cancers. This combination of microneedles with ultrasound may provide a new method to effectively deliver anticancer drugs while reducing the negative effects on patients.

This research focused on assessing the efficacy of combining microneedles and ultrasound for delivery of a chemotherapy drug, doxorubicin. The hypotheses of the research were: 1) ultrasound application induces a controlled release of doxorubicin from doxorubicin-particles; 2) ultrasound application increases doxorubicin release from doxorubicin-particles loaded upon microneedles; and 3) ultrasound-enhanced doxorubicin release corresponds to high cell death of murine melanoma cells in vitro.

2. Materials and Methods

2.1 Chemicals

All chemicals were used as instructed by suppliers. Poly(d, l-lactide-co-glycolide) (PLGA), dichloromethane (DCM), acetone, dimethyl sulfoxide (DMSO), and 2-hydroxy-4’-(2-hydroxyethoxy)-2-methylpropiophenone (2595) were purchased from Sigma Aldrich. Doxorubicin hydrochloride was purchased from TSZ Chemistry. Alginic acid sodium salt was purchased from MP Biomedicals. 300 kDa hyaluronic acid (HA) was purchased from Shandong Freda Biochem Co. Methacrylic anhydride (MA) was purchased from Polysciences.

2.2 Preparation of doxorubicin microparticles

Doxorubicin microparticles (DOX-MPs) were prepared via double emulsion and electro-spraying methods described by Di et al. [11]. Briefly, double emulsion refers to the process by which the hydrophilic drug in a polar solution is first suspended into a nonpolar solution, and then transferred back into a polar solution, where the hydrophobic chains of the nonpolar solution self-encapsulate into microspheres. The polar phase for this study was an alginate solution, which was prepared beforehand at 0.5%, 1%, and 2% concentrations in water and dissolved overnight. The 1% and 2% solutions were centrifuged to eliminate impurities in the solution. The nonpolar phase of the double emulsion was prepared by dissolving 180 mg of PLGA into 4.5 mL of DCM. Electro-spraying uses an electrical field to disperse droplets from a needle at high electric potential to form microparticles.

On the day of the procedure, 5.0 mg DOX was dissolved in 0.5 mL water using a sonicator and vortex to facilitate DOX dissolution. The DOX solution was added to the nonpolar phase and then sonicated at 40% amplitude for a total of two minutes (sonicated every other second) to ensure a homogeneous solution. This mixture was added to 25 mL of 1.0% alginate solution and again sonicated for another two minutes.
Then, the mixture was poured into 250 mL of 0.2% alginate solution and stirred for 2 hours to allow the DCM to evaporate. The mixture was divided into eight 50-mL centrifuge tubes (each containing ca. 35 mL) and centrifuged for 15 min, and the supernatant was removed. The particles at the bottom were re-suspended, combined into two tubes, and centrifuged again. For each tube, the particles at the bottom were suspended in 1.0 mL of water, and 3.0 mL of centrifuged 2% alginate solution was added to ensure a 1:3 ratio of water to alginate. This solution was sonicated to produce a homogeneous suspension. This mixed suspension was crosslinked with a 20 mM BaCl2 solution to form microparticles using the electrospray procedure with a needle potential of 7 volts and a flow rate of 0.155 mL/min. BaCl2


was used as a crosslinker because it was found to be more biocompatible with cells [11]. These particles were stored in a sterile tube at a concentration of 1.84 × 10⁻⁴ M DOX at 4 °C [11].

2.3 Preparation of hyaluronic microneedles

The microneedles (MNs) were prepared by a general procedure of centrifuging a solution of methacrylated hyaluronic acid with crosslinker (m-HA/crosslinker) into silicon molds [11]. Hyaluronic acid (HA) is a carbohydrate found widely in many of the human body’s tissues, thus ensuring biodegradability [12]. The m-HA solution was prepared by dissolving hyaluronic acid in water at 20 mg/mL overnight. Then, 4.0 mL of methacrylic anhydride (MA) was added, which dropped the pH of the solution to 2. To bring the pH of the solution to about 8, 400 μL of 5.0 M NaOH was added in a cold room; a further 50-200 μL of NaOH was then added every 10 minutes until the pH stabilized at about 8. The m-HA was then precipitated in acetone, washed with ethanol, and dissolved in DI water. This solution underwent dialysis for 48 hours and was then frozen at -80 °C overnight. Lyophilization in a Labconco FreeZone lyophilizer for two days gave an 87.5% yield. This solid was dissolved in deionized water at a concentration of 4% w/v each of m-HA and MBA and 0.005% w/v photoinitiator. The solution was in an amorphous phase until cross-linked.

To make the HA-MNs, silicon microneedle molds purchased from Blueacre Technology Ltd. were used. The molds were first washed three times in tap water, and then once in deionized water. As illustrated by Figure 1, a thick layer of m-HA/crosslinker solution was added on top of each mold and allowed to dry in a desiccator until viscous enough to be placed into a centrifuge.

Figure 1. Procedure for making m-HA/crosslinker microneedles.

Molds were rotated at 4000 rpm for 5.0 minutes with the lids of the rotation carriages open to facilitate faster drying. Following this centrifugation, additional layers of m-HA/crosslinker solution were added, each followed by centrifugation, until the molds appeared white due to the solution completely filling the tips of the molds. One final layer of m-HA/crosslinker solution was added, and the molds were placed in a desiccator overnight to dry (Fig. 1). The MNs could then be peeled off and crosslinked using UV light at a wavelength of 365 nm for 10 seconds, then stored at room temperature. Each microneedle was pyramidal in shape, with a height of 800 μm, a base length of 400 μm, and 200 μm between adjacent microneedles (Fig. 2).
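As a rough illustration of the needle dimensions given above, the sketch below computes the volume of one square-based pyramidal needle and the resulting array density. It assumes that the 200 μm spacing is the edge-to-edge gap between neighboring bases (so the center-to-center pitch is 600 μm) and that the needles sit on a square grid; neither detail is stated in the text, and the function names are illustrative.

```python
def pyramid_volume_um3(base_um: float, height_um: float) -> float:
    """Volume of a square-based pyramid in cubic micrometers (V = b^2 * h / 3)."""
    return base_um ** 2 * height_um / 3.0


def needles_per_cm2(base_um: float, gap_um: float) -> float:
    """Needles per square centimeter on a square grid.

    Assumption: gap_um is the edge-to-edge distance between bases,
    so the center-to-center pitch is base_um + gap_um.
    """
    pitch_um = base_um + gap_um
    return (10_000.0 / pitch_um) ** 2  # 1 cm = 10,000 micrometers


volume_um3 = pyramid_volume_um3(base_um=400.0, height_um=800.0)
# 1 nL equals 1e6 cubic micrometers
print(f"volume per needle: {volume_um3 / 1e6:.1f} nL")
print(f"array density: {needles_per_cm2(400.0, 200.0):.0f} needles per cm^2")
```

Under these assumptions, the per-needle volume gives an upper bound on how much solution a single needle tip can hold.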

Figure 2. Scanning electron micrograph of the microneedles at 50x magnification.

Hydrogels used in the microfluidic experiments were created by crosslinking the liquid m-HA/crosslinker solution: 200 μL of the solution was poured into a polydimethylsiloxane (PDMS) mold and then exposed to ultraviolet light for 10 seconds. This resulted in a jelly-like gel that was able to hold its shape when picked up and manipulated.

2.4 Preparation of murine melanoma cells

Murine melanoma cells (ATCC-B16-F10) were used to model melanoma. These cells were chosen for their similarity to human melanoma. Cells were grown as instructed in DMEM with 10% fetal bovine serum and passaged every 2-3 days in an incubator at 37 °C with 5% CO2 [13]. An MTT assay was used as instructed to evaluate cell viability [14].

2.5 Measurement of DOX release from microparticles into vials following ultrasound exposure

Ultrasound was first applied for varying lengths of time to DOX-MPs in a vial to verify their ultrasound responsiveness. An amplifier and wave generator were attached to a transducer, which was suspended within a tank of water. The wave generator was set to burst mode, with 950 kHz frequency, 400 mVpp amplitude, and burst intervals of 20 μs. The focal point was found by adjusting the height of the transducer below the water until the maximal spray of water was released from the water surface. Then, 1.0 mL of DOX-MP suspension (1.84 × 10⁻⁴ M DOX) was washed with phosphate-buffered saline (PBS) and placed in a vial at a final concentration of 1.84 × 10⁻⁴ M DOX. The vial was held over the focal point of the transducer for the set amount of time. After application, the supernatant was removed. Because DOX is inherently fluorescent, and its fluorescence intensity correlates with its concentration (Mohan & Rapoport, 2010), the collected supernatants were evaluated by measuring their fluorescence with a Tecan Infinite M200 Pro microplate reader. The process was repeated for three trials at each application time length.

2.6 Measurement of DOX release from DOX-MP on MN using ultrasound

Controlled DOX release from DOX-MPs on MNs was modeled in two different ways to examine the effectiveness of MNs as a mechanism for facilitating DOX release. First, a microfluidic setup was used to model the transdermal flow of blood through microneedles. This model examined DOX release at given time intervals. Second, a dialysis membrane was used to assess the total release of DOX. Set time intervals of ultrasound were applied to both structures.

2.6.1 Doxorubicin release modeled with microfluidics

DOX release from MNs was first modeled using microfluidics. A microfluidic chip made by another laboratory was used. As shown in Figure 3, an input pump containing PBS was attached to one end and an output pump collected the solution at the other end; PBS flowed through the middle of the microfluidic chip, where the DOX-MN rested. The pumps operated at a flow rate of 100 μL per minute.

Figure 3. The microfluidic chip used to model DOX release from DOX-MPs loaded upon MNs.

The DOX-MN was set up as follows. First, a piece of Dragon Skin film with a hole reservoir was laid on top of the MN, and the reservoir was filled with a piece of crosslinked hydrogel. Next, another layer of Dragon Skin with a hole was laid on top, and the hole reservoir was filled with 200 μL of 1.82 × 10⁻³ M DOX-MP. Then, the entire structure was covered with Parafilm to avoid direct contact between the DOX-MPs and the ultrasound apparatus (Fig. 4). Ultrasound was applied 10 minutes after the pumps were started, which allowed the DOX-MP to reach the microneedle tips. The sample solution that passed through the microneedles was collected every ten minutes for an hour. Three replicate MN trials were run for each ultrasound application time length.
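The collection scheme above fixes the volume of each sample: 100 μL per minute for ten minutes gives 1.0 mL per fraction. The sketch below shows one way such fractions could be converted into a cumulative percent release relative to the 200 μL of 1.82 × 10⁻³ M DOX-MP that was loaded. The per-fraction concentrations are hypothetical placeholders (in practice they would come from a fluorescence calibration curve, which is not described here), and the function name is illustrative rather than part of the study's analysis.

```python
FLOW_RATE_UL_PER_MIN = 100.0   # pump flow rate stated in the text
FRACTION_MINUTES = 10.0        # one sample collected every ten minutes
LOADED_VOLUME_UL = 200.0       # DOX-MP volume placed in the reservoir
LOADED_CONC_M = 1.82e-3        # DOX concentration of the loaded DOX-MP


def cumulative_release_percent(fraction_conc_molar):
    """Cumulative percent of loaded DOX recovered after each fraction."""
    fraction_volume_l = FLOW_RATE_UL_PER_MIN * FRACTION_MINUTES * 1e-6
    loaded_mol = LOADED_CONC_M * LOADED_VOLUME_UL * 1e-6
    released_mol = 0.0
    percentages = []
    for conc in fraction_conc_molar:
        released_mol += conc * fraction_volume_l
        percentages.append(100.0 * released_mol / loaded_mol)
    return percentages


# Hypothetical concentrations (mol/L) for six ten-minute fractions.
example_fractions = [1.0e-6, 9.0e-6, 8.0e-6, 7.0e-6, 6.0e-6, 5.0e-6]
for minute, pct in zip(range(10, 70, 10), cumulative_release_percent(example_fractions)):
    print(f"{minute:2d} min: {pct:5.2f}% of loaded DOX released")
```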

Figure 4. Microfluidic set-up for measuring DOX release. Ultrasound-induced cavitation released DOX from DOX-MP and facilitated DOX flowing through the hydrogel and microneedle into the PBS solution, where the withdrawal pump collected the sample solution.

2.6.2 Doxorubicin release modeled through dialysis

Ultrasound-triggered DOX release from DOX-MN was also modeled using a dialysis tube. This made it possible to observe the accumulated release of DOX into a solution, rather than only the release at a given point in time as in the microfluidic trials. Dialysis uses a semi-permeable membrane that allows only particles below a certain molecular weight to cross. Particles at a higher concentration on one side of the membrane move by diffusion to the other side until an equilibrium is established across the membrane. As shown in Figure 5, a barrel with a bottom composed of dialysis membrane was placed into a tube. A microneedle was added to the bottom of the dialysis barrel. Then, a layer of DOX-MP was added on top, and the barrel was filled with water. The rest of the tube was then filled with water. Parafilm was used to wrap the barrel, and ultrasound was then applied on the Parafilm by immersing it above the focal point of the ultrasound in a water tank for specific periods of time.
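The approach to equilibrium described above can be sketched as a two-compartment diffusion model, in which the flux across the membrane is proportional to the concentration difference between the barrel and the surrounding tube. This is a simplified illustration, not the analysis used in the study: the volumes, starting concentration, and permeability constant below are hypothetical, and the model ignores the ongoing release of DOX from the microparticles.

```python
def simulate_dialysis(c_barrel, c_tube, v_barrel_ml, v_tube_ml,
                      permeability_ml_per_min, dt_min, steps):
    """Euler integration of diffusion between two well-mixed compartments.

    Returns a list of (time_min, barrel_concentration, tube_concentration).
    """
    history = [(0.0, c_barrel, c_tube)]
    for step in range(1, steps + 1):
        # Amount crossing the membrane during this time step.
        moved = permeability_ml_per_min * (c_barrel - c_tube) * dt_min
        c_barrel -= moved / v_barrel_ml
        c_tube += moved / v_tube_ml
        history.append((step * dt_min, c_barrel, c_tube))
    return history


# Hypothetical values: 1 mL barrel at relative concentration 1.0, 10 mL tube.
trace = simulate_dialysis(c_barrel=1.0, c_tube=0.0, v_barrel_ml=1.0,
                          v_tube_ml=10.0, permeability_ml_per_min=0.05,
                          dt_min=0.1, steps=900)
for time_min, barrel, tube in trace[::300]:
    print(f"t = {time_min:5.1f} min  barrel = {barrel:.4f}  tube = {tube:.4f}")
```

In this sketch the tube concentration levels off at the total amount divided by the total volume, which is why samples taken from the surrounding water can understate how much drug has left the particles.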

Figure 5. Ultrasound promoted DOX-release into dialysis solution


2.7 Measurement of cell viability

The efficacy of ultrasound-enhanced doxorubicin delivery via microneedles was further evaluated by determining the viability of B16-F10 murine melanoma cells following the application of sample solution prepared by the same process as the ultrasound trials described in section 2.5. A 96-well plate was seeded at a density of 5,000 cells per well and incubated overnight. Supernatant was prepared by adding 100 μL of DOX-MP and 1.5 mL of water to vials and applying ultrasound for 0, 2.0, and 4.0 minutes. 15 μL of this supernatant was applied to each well. After 24 hours of incubation, 20 μL of MTT reagent was added per well and the well plate was incubated for 3.5 hours. Then, MTT solvent was added and the well plate was shaken for 15 minutes before being evaluated spectrophotometrically for cell viability.
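Viability from an MTT assay is commonly reported as the absorbance of treated wells relative to untreated control wells. The sketch below shows that calculation under stated assumptions: the absorbance readings and the blank value are hypothetical, and the function is an illustration rather than the study's actual analysis.

```python
def percent_viability(treated_abs, control_abs, blank_abs=0.0):
    """Mean treated absorbance as a percentage of mean control absorbance.

    blank_abs is the absorbance of wells without cells, subtracted from both.
    """
    treated = sum(treated_abs) / len(treated_abs) - blank_abs
    control = sum(control_abs) / len(control_abs) - blank_abs
    if control <= 0:
        raise ValueError("control absorbance must exceed the blank")
    return 100.0 * treated / control


# Hypothetical absorbance readings from replicate wells.
control_wells = [0.82, 0.80, 0.84]
treated_wells = [0.70, 0.69, 0.71]
viability = percent_viability(treated_wells, control_wells, blank_abs=0.05)
print(f"viability relative to control: {viability:.1f}%")
```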

3. Results

3.1 Formation of doxorubicin microparticles

All DOX-MPs were spherical, with various sizes (Figure 6A). As shown in Figure 6B, the average size of the DOX-MPs was 536 μm (±178 μm).

Figure 6. A. Microscope image of DOX-MPs (left); B. Histogram of the DOX-MP diameters out of a sample of 500 (right).

3.2 Ultrasound effect on doxorubicin release from microparticles

DOX release from microparticles was evaluated following ultrasound exposure to confirm the ultrasound responsiveness of the microparticles and to determine the time course of DOX release from the MPs as influenced by different lengths of ultrasound application. DOX release was analyzed after five ultrasound application times (0, 0.5, 1.0, 2.0 and 4.0 minutes). With no ultrasound application, there was barely any noticeable release of DOX, and the DOX that was detected was attributed to passive release from the MPs (Fig. 7). At half a minute of ultrasound, the fluorescence was 11,120 fluorescence units (FUs), and it increased to 37,048 FUs after 4 minutes of ultrasound treatment. The ultrasound was also observed to slightly heat the vial. When plotted using a mathematical program, the release was linear, with a correlation coefficient of 0.996. At the longest application length, four minutes, the maximum dose of DOX was released, at a release efficacy of 96%.
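A linear trend of this kind can be checked with an ordinary least-squares fit and a Pearson correlation coefficient. The sketch below is illustrative only: the 0.5-minute and 4-minute fluorescence values are the ones quoted in the text, while the 0-, 1- and 2-minute values are hypothetical placeholders, so the coefficient it prints is not the study's reported 0.996.

```python
from statistics import mean


def linear_fit(x, y):
    """Return (slope, intercept, pearson_r) for paired samples x and y."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, pearson_r


minutes = [0.0, 0.5, 1.0, 2.0, 4.0]
# 11120 and 37048 FU are quoted in the text; the other three are hypothetical.
fluorescence = [7000.0, 11120.0, 14500.0, 22000.0, 37048.0]

slope, intercept, r = linear_fit(minutes, fluorescence)
print(f"slope = {slope:.0f} FU/min, intercept = {intercept:.0f} FU, r = {r:.4f}")
```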

Figure 7. DOX fluorescence at various times of ultrasound exposure. Values were means ± 1 standard error (SE).

3.3 Ultrasound-triggered doxorubicin release from microneedles using microfluidics and dialysis simulations

3.3.1 Microfluidic simulation

As mentioned above, transdermal blood flow was first modeled with microfluidics. Control trials with no ultrasound application were compared against trials with four minutes of ultrasound application, resulting in the different DOX release curves shown in Figure 8. DOX release culminated in 12.6% of the DOX being released from the DOX-MPs with 4 minutes of ultrasound, compared to only 2.1% in the control (Fig. 8).

Figure 8. DOX release as shown by fluorescence over a period of 60 minutes, modeled using microfluidics in the no-ultrasound control (blue) and in the 4-min ultrasound treatment (red). Values were means ± 1 standard error (SE).

For the ultrasound application, ultrasound treatment was introduced after the microfluidic devices and pumps had been running for 10 minutes with the MN and DOX-MP loaded onto the microfluidic device. Therefore, DOX release was statistically identical for both the control and the ultrasound treatment at time 10 min. DOX release in the control remained at about the same level across the 60-minute period. However, fluorescence levels quickly increased in the ultrasound treatment to reach ca. 45,000 FUs by 20 min and remained significantly higher than the control across the 60-minute period.

3.3.2 Dialysis simulation

Doxorubicin release using dialysis membranes and tubes was evaluated with three trials at each ultrasound application time. Ultrasound was applied for 0 and 4 minutes, and DOX release was monitored for 90 min, with samples taken at 0, 10, 20, 30, 60 and 90 min. Compared to the control, ultrasound application significantly increased DOX release after the first measurement, that is, from 10 min into the experiment (Fig. 9). DOX release in the ultrasound treatment quickly reached its peak by 30 min and then stabilized for the rest of the time. In contrast, DOX release was much slower in the control and remained lower than in the ultrasound treatment across the whole test period (Fig. 9).

Figure 9. DOX release (quantified by fluorescence), modeled using dialysis tubing. Trials were applied with 0 (blue solid line) and 4 (red dashed line) minutes of ultrasound. Values were means ± 1 standard error (SE).

3.4 Cell viability assay from microfluidic supernatant application

An MTT assay was used as per instructions to evaluate the viability of the DOX-MP MN-treated cells and thereby the efficacy of the treatment [14]. The plate was evaluated spectrophotometrically after 24 hours of incubation to determine cell viability. Results showed that cell viability was lowest in the treatment with sample solutions from the application of 4 minutes of ultrasound, with 85% of cells remaining as compared to the trial with no DOX application (Fig. 10). In contrast, cells remained largely healthy in the controls, with 91% remaining (Fig. 10).

4. Discussion

Enhancing transdermal drug permeation is key because one of the major technical obstacles to using TDD lies in the low efficiency of delivering large molecules using conventional TDD techniques [7]. Results from this study showed that the use of ultrasound enhanced DOX release from microparticles in the model system, and that the released DOX was further delivered through the microneedles (Fig. 8, 9). These effects were shown not only by significantly higher DOX fluorescence (Fig. 7, 8, 9) but also by altered cell viability in the in vitro experiment (Fig. 10).

Figure 10. Viability of murine melanoma cells in DOX solutions obtained following the ultrasound treatment. Values were means ± standard error (SE).

Ultrasound has been theorized to stimulate drug release from MPs through the cavitation effect, and has previously been shown to achieve the controlled release of an array of drugs [15, 16]. Results from this study further demonstrate its effectiveness in releasing DOX. Microneedles have been widely used in transdermal drug delivery because they represent a painless and non-invasive alternative to traditional hypodermic needles, and they were shown to facilitate drug release in multiple ways in prior literature. However, only a limited number of experiments have shown that the combination of ultrasound and MNs can increase the skin permeability of large molecules [7]. This is the first attempt to assess the efficiency of combining ultrasound with microneedles on skin cancer cells.

The combination of ultrasound and MNs may provide unique opportunities to facilitate the delivery of drugs such as DOX for skin cancers. The potential additive effect of this combination stems from the different mechanisms by which each factor enhances drug permeability [7]. While ultrasound stimulates molecular release mainly through the cavitation effect, MNs enhance drug delivery primarily through their capability to directly penetrate skin, thereby bypassing the stratum corneum barrier [8]. From the results (Figs. 8 and 9), it seems that ultrasound was particularly useful in stimulating the movement of drug molecules from the microneedles, indicating the potential of the combination for delivering large drug molecules for skin cancers.

DOX-MP release from MNs was evaluated using microfluidics. Microfluidics were used because the constant flow of fluid through the microfluidic chip simulated the transdermal flow of blood through microneedle tips in vivo. As demonstrated by Figure 8, the accumulated release dose of DOX rapidly increased with ultrasound application and remained at a higher fluorescence level, whereas the trial with no ultrasound application remained very low in terms of fluorescence.

To the best of our knowledge, this is the first time that dialysis membranes have been used to simulate the release of drugs from particles on microneedles. Since a dialysis tube was used, it is assumed that the samples taken from the immersion water were somewhere along the way towards reaching an equilibrium of equal concentrations of DOX between the dialysis barrel and tube. Since the fluorescence was much lower than in the microfluidic results, it is assumed that the dialysis membrane likely played a large role by establishing the equilibrium system. Thus, without the dialysis membrane, the system most likely would have released a much higher amount of DOX. However, the results still confirm the hypothesis that ultrasound can trigger the controlled release of DOX, as demonstrated by the significant difference between the 4-minute and 0-minute trials in DOX release (Fig. 9) and cell viability.

Since multiple factors determine the efficacy and usefulness of ultrasound-triggered drug delivery through MNs for clinical applications, there are many parameters that can be modified to further enhance drug release. For ultrasound, the frequency, amplitude, and application time are critical because they modulate the power and behavior of the waves. For MNs, the material and strength of the needles can be very important [8]. Ultrasound has been shown to cause the application surface to heat up, a condition that has been known to cause increased blood flow [17]. Because of this and the fact that 4 minutes is a relatively short time period, it may be desirable to reduce the amplitude of the ultrasound to a lower mVpp and allow the application time to be extended in an in vivo trial, lest the heat burn the skin. Thus, a balance between these two factors will be critical in an in vivo evaluation in order to optimize drug penetration without inducing any discomfort to the patient.

5. Future Work

Results from this research showed that the integration of microneedles and ultrasound facilitated DOX release from both DOX-MPs alone and DOX-MPs loaded upon MNs. Extended ultrasound application resulted in reduced cell viability. The next step is to test the system in vivo, thus eliminating the need for microfluidic modeling systems. A possible animal study would use the same ultrasound parameters and application times, but apply the ultrasound to an MN loaded with DOX-MP and attached to a mouse with a xenograft melanoma tumor. Another possibility is to apply ultrasound to the mouse's bare skin first, before applying the DOX-MP on MN. Han and Das examined this possibility and found that using ultrasound application after pretreating the skin with microneedles increased cavitation in the lower layers of the skin and thus increased drug permeability with MN application [9]. Thus, this sequence of drug delivery may enhance efficiency for the in vivo treatment of melanoma.

Modifications should also be made to maximize the efficacy and targeting potential of the microparticles themselves. These modifications can include designing multiple complex intrinsic response systems for the microparticles to be used in conjunction with ultrasound and MNs. Intrinsic response mechanisms for nanoparticles include pH, redox, glucose, and enzymatic responses [16]. For example, a recent study found that pH-responsive, DNA-composed “nanoclews” were able to successfully target MCF-7 cells [18]. This intrinsic responsiveness could add another level of specificity to the MPs, thus allowing them to be both internally and externally triggered. Since many combinations of intrinsic response mechanisms can be formulated, this offers a large number of potential combinations of MP, MN, and ultrasound.

6. Conclusion

Our results demonstrated that the combination of ultrasound and microneedles facilitated DOX release. This method showed that by combining the two complementary mechanisms that affect drug permeation, overall efficacy could be improved by alleviating the limitations in each individual component, i.e. low drug permeability and the stratum corneum barrier. Since, to the best of our knowledge, this method has never been used before, its results are novel, and it is a promising way of simulating the release of drugs from microneedles. Ultrasound was found to successfully release DOX from DOX-MPs, both from DOX-MPs alone and in combination with MNs. Transdermal blood flow was modeled using microfluidics and dialysis membranes. Both of these modeling systems confirmed the higher release rate of DOX from DOX-MP, the microfluidics by simulating blood flow and assessing DOX release at points in time, and the dialysis membranes by assessing the total amount of DOX released. The efficacy of the method was further confirmed by cell viability assays, where lower cell viability correlated with longer ultrasound application. This model of combining ultrasound application with microneedles holds promise for the treatment of melanoma, especially as a treatment after initial surgery to reduce the risk of missing metastasized cells.

7. Acknowledgments

Thank you to Dr. Zhen Gu, professor at the NCSU-UNCCH Joint Biomedical Engineering Program, for giving insight on the validity of the project idea and trial designs; Jin Di, post-Ph.D student at Dr. Gu’s lab, for teaching the standard protocol for making microparticles and using the ultrasound machine; Yanqi Ye, graduate student at Dr. Gu’s lab, for teaching the protocol for making microneedles and giving insight on trial design validity; NCSSM, for supplying transportation to and from the research facility; and Dr. Myra Halpin, for giving insight on project design validity.

8. References

[1] CDC. (2012). United States Cancer Statistics. Retrieved September 18, 2015, from uscs/toptencancers.aspx
[2] Howlader N, Noone AM, Krapcho M, Garshell J, Miller D, Altekruse SF, Kosary CL, Yu M, Ruhl J, Tatalovich Z, Mariotto A, Lewis DR, Chen HS, Feuer EJ, Cronin KA (eds). SEER Cancer Statistics Review, 1975-2012, National Cancer Institute. Bethesda, MD, http://, based on November 2014 SEER data submission, posted to the SEER web site, April 2015.
[3] Melanoma. (2014, June 16). Retrieved September 16, 2015, from
[4] Smyth, E., & Carvajal, R. (n.d.). Skin Cancer Foundation. Retrieved May 16, 2015, from
[5] An, M., Wijesinghe, D., Andreev, O.,Y. K. R. and D. M. E. (n.d.). pH-(low)-insertion-peptide (pHLIP) translocation of membrane impermeable phalloidin toxin inhibits cancer cell proliferation on JSTOR. Retrieved May 23, 2015, from arch=yes&resultItemClick=true&searchText=chemother apy&searchText=side&searchText=effects&searchUri=% 2Faction%2FdoBasicSearch%3FQuery%3Dchemothera py%2Bside%2Beffects%26amp%3Bacc%3Don%26amp% 3Bwc%3Don%26amp%3Bfc%3Doff%26amp%3Bgroup %3Dnone&seq=1#page_scan_tab_contents
[6] Liu, S., Jin, M., Quan, Y., Kamiyama, F., Katsumi, H., Sakane, T., & Yamamoto, A. (2012). The development and characteristics of novel microneedle arrays fabricated from hyaluronic acid, and their application in the transdermal delivery of insulin. Journal of Controlled Release : Official Journal of the Controlled Release Society, 161(3), 933–41.
[7] Han, T., & Das, D. B. (2015). Potential of combined ultrasound and microneedles for enhanced transdermal drug permeation: a review. European Journal of Pharmaceutics and Biopharmaceutics : Official Journal of Arbeitsgemeinschaft Für Pharmazeutische Verfahrenstechnik e.V, 89, 312–28. ejpb.2014.12.020
[8] Donnelly, R. F., Raj Singh, T. R., & Woolfson, A. D. (2010). Microneedle-based drug delivery systems: microfabrication, drug delivery, and safety. Drug Delivery, 17(4), 187–207.
[9] Han, T., & Das, D. B. (2013). Permeability enhancement for transdermal delivery of large molecule using low-frequency sonophoresis combined with microneedles. Journal of Pharmaceutical Sciences, 102(10), 3614–22.
[10] Di, J., Price, J., Gu, X., Jiang, X., Jing, Y., & Gu, Z. (2014). Ultrasound-triggered regulation of blood glucose levels using injectable nano-network. Advanced Healthcare Materials, 3(6), 811–6. adhm.201300490
[11] Di, J., Yao, S., Ye, Y., Cui, Z., Yu, J., Ghosh, T. K., … Gu, Z. (2015). Stretch-Triggered Drug Delivery from Wearable Elastomer Films Containing Therapeutic Depots. ACS Nano.
[12] Papakonstantinou, E., Roth, M., & Karakiulakis, G. (2012). Hyaluronic acid: A key molecule in skin aging. Dermato-Endocrinology, 4(3), 253–8. http://doi.org/10.4161/derm.21923
[13] ATCC. (n.d.-a). B16-F10 ATCC ® CRL-6475TM Mus musculus skin melanoma. Retrieved September 17, 2015, from aspx
[14] ATCC. (n.d.-b). MTT Cell Proliferation Assay. Retrieved September 17, 2015, from media/DA5285A1F52C414E864C966FD78C9A79. ashx
[15] Barry, B. (2001). Novel mechanisms and devices to enable successful transdermal drug delivery. European Journal of Pharmaceutical Sciences, 14(2), 101–114.
[16] Lu, Y., Sun, W., & Gu, Z. (2014). Stimuli-responsive nanomaterials for therapeutic protein delivery. Journal of Controlled Release : Official Journal of the Controlled Release Society, 194C, 1–19. jconrel.2014.08.015
[17] Charkoudian, N. (2003). Skin blood flow in adult human thermoregulation: how it works, when it does not, and why. Mayo Clinic Proceedings, 78(5), 603–12. http://
[18] Sun, W., Jiang, T., Lu, Y., Reiff, M., Mo, R., & Gu, Z. (2014). Cocoon-like self-degradable DNA nanoclew for anticancer drug delivery. Journal of the American Chemical Society, 136(42), 14722–5. ja5088024


Mathematics and Computer Science Research

Modelling Causes of Honey Bee Colony Collapse Through Population Dynamics

Cynthia Dong


Honey bees, Apis mellifera, have experienced large hive failures since 2006. In order to consider some of the factors that may have caused such hive failures, as well as the puzzling phenomenon of Colony Collapse Disorder, models have been created to simulate the effect of death rates, food availability, and egg laying rate on a colony of bees. Differential equations [4],[2] are modeled with Vensim. Various variables are manipulated and used to create graphical representations of how hive populations change in response to changes in their environment. As forager deaths increase, social inhibition decreases, leading to even higher mortality rates, which contributes to hive collapse. When food availability and the laying rate of the queen decrease, hive failure ensues, and when forager mortality is critically high, the entire colony will die. This paper suggests various factors that may have led to the large number of hive failures since 2006. It also provides support for tactics to ensure honey bee survival.

Keywords: Honey bee; Apis mellifera; bee population dynamics; Colony Collapse Disorder; mathematical model

1. Introduction

In the past decade, the western honey bee, Apis mellifera, has experienced alarming decreases in population. Honey bees and other pollinators cross pollinate over 30% of the world's crops and over 90% of non-domesticated plants [14], implying severe economic and agricultural ramifications if honey bees continue to experience losses at the current rate. The recent concern over increasing hive failure began in 2006, when many beekeepers in the United States noticed a 30-90% drop in their honey bee populations. In circumstances where the cause of hive failure was unknown, the phenomenon was labelled Colony Collapse Disorder, or CCD [11]. Although instances of CCD have decreased slightly since 2010, beekeepers continue to experience abnormally high colony losses year round [15]. The threat of CCD recurring is a real one, with implications for the Earth's growing population and its need to be fed.

Colony Collapse Disorder is characterized by colonies where adult worker bees have abandoned their hives despite the presence of a live queen, food stores, and sealed brood. This is very abnormal activity. In hives found to be actively collapsing, only young workers remained. The remaining workers were reluctant to be fed by beekeepers. No dead bees were found near the entrance of the hives [16]. European countries experienced similar losses in the same time frame, but there is no strong consensus on whether the losses were caused by CCD or by other unrelated issues [10].

Scientists have identified various factors that affect hive survival rates, such as parasites, viruses, pesticides, and climate change, but Colony Collapse Disorder remains unexplained. One theory for the rapid decline in bee populations involves how bees progress through their social hierarchy, and how outside stressors may negatively alter the social dynamics of the colony [4].

In a honey bee colony, there are four main groups. First

and foremost is the queen, who mainly lays eggs, ensuring future generations of workers for the hive. She is waited on by the next level in the hive hierarchy: the worker bee. Worker bees, who are all female, are split into two castes: the hive bee and the field bee. Hive bees, also known as house bees, are responsible for maintaining the hive and its inhabitants. After three weeks, a hive bee will graduate to field bee, also known as a forager bee. The role of the field bee is to gather nectar and water, among other things necessary for the hive's survival. The lowest level of the honey bee hierarchy consists of the drones, male bees whose sole purpose is to fertilize the queen [5].

Healthy colonies must have well-balanced demographics, with the right percentage of each type of worker, in order to ensure the survival of the hive. Foragers release pheromones that regulate the rate at which hive bees mature, which ensures that enough hive bees remain to take care of the brood, queen, and food stores. However, when there is a shortage of field bees, hive bees compensate by maturing early and then leaving the colony to forage. Not only do these premature field bees complete fewer foraging trips, with a higher death toll during their initial flights, than naturally matured field bees, but they also leave behind an understaffed hive [13],[20]. Threats such as the varroa mite, neonicotinoids, early blooming flowers, habitat destruction, and invasive species [12] all place more stress on the colony, upsetting its delicate balance, which may then lead to Colony Collapse Disorder.

Many different factors influence honey bee colony survival. In this paper, the effect of these factors on honey bee population dynamics is modeled using differential equations adapted from Khoury et al.
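The social inhibition mechanism described above can be illustrated with a small two-population simulation of hive bees and foragers. The sketch below is not this paper's Vensim model: the equation forms (a saturating eclosion term and a recruitment rate that falls as the forager fraction rises) and all parameter values are assumptions chosen for illustration in the spirit of hive-forager models such as that of Khoury et al., and the adapted equations used in this paper may differ.

```python
def simulate_hive(forager_death_rate, days=400, dt=0.1,
                  hive_bees=16000.0, foragers=8000.0,
                  laying_rate=2000.0, brood_half_saturation=27000.0,
                  max_recruitment=0.25, inhibition=0.75):
    """Euler simulation of hive bees and foragers (all parameters hypothetical).

    Returns the (hive_bees, foragers) populations after `days` days.
    """
    for _ in range(round(days / dt)):
        total = hive_bees + foragers
        if total < 1.0:  # fewer than one bee left: treat the colony as dead
            return 0.0, 0.0
        # New adults per day: limited by how many workers can rear brood.
        eclosion = laying_rate * total / (brood_half_saturation + total)
        # Social inhibition: a larger forager fraction slows recruitment.
        recruitment = hive_bees * (max_recruitment - inhibition * foragers / total)
        hive_bees += (eclosion - recruitment) * dt
        foragers += (recruitment - forager_death_rate * foragers) * dt
    return hive_bees, foragers


for death_rate in (0.1, 0.6):
    hive, field = simulate_hive(death_rate)
    print(f"forager death rate {death_rate}: "
          f"{hive:.0f} hive bees, {field:.0f} foragers after 400 days")
```

With these illustrative values, a low forager death rate lets the colony settle at a stable size, while a high one drains the hive bees into foraging faster than they can be replaced.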

2. Model and Methods

The two primary models in this paper are built with the systems dynamics software Vensim and based upon differential equations created by Khoury et al. They serve to simulate the situations that may lead to hive failure or Colony Collapse Disorder through the manipulation of certain variables. These models allow projections of hive numbers after the environment of the hive changes, which can be visualised graphically. The benefits of using algorithms to conduct experiments are numerous: variables are better controlled for, no complex tracking systems need to be set up for each bee, and data that would take 100 days to collect in an experimental setting can be projected in a few seconds. However, the complexity of the model constrains the accuracy of the projections. Since certain assumptions are held to be true, and some variables are unaccounted for, the models are not perfect reflections of what would happen to the simulated hives in reality.

2.1 Social Inhibition Model

In Fig. 1, created with Vensim, the various populations of a colony, as well as the factors that affect their rate of change, are pictured. The boxes indicate the different bee populations, and black arrows demonstrate a rate affecting a population. If a black arrow points towards a box, the rate increases the population, and if it points away, the rate decreases the population. The clouds are sources and sinks: the sources are unlimited, while the sinks have no maximum capacity. For example, the queen can lay an unlimited number of eggs, hence the cloud for the source of the eggs. The unboxed variables dictate the processes and populations (black arrows and boxes), and the blue arrows indicate that the source of the arrow affects whatever the arrow points at.

2.1.1 Assumptions

Drones are excluded, as there are almost always more than enough drones to fertilize the queen. Drones provide no other service to the hive [5].
Field bees are the only population which experiences death; unless acted upon by a brood disease or other factor, hive bees and brood do not die. The brood population includes all parts of the life cycle before eclosion, or hatching from the cell. This model runs in the summer, so seasonal behaviours are not considered.


Figure 1: Social Inhibition Model

2.1.2 Values

As discussed in the Introduction, if there is a sudden drop in field bee numbers, less of the social inhibition pheromone is released to the hive bees. The hive bees will then convert early into field bees, resulting in higher mortality rates which may eventually lead to colony failure. The changes in three populations were measured: the brood, hive bees, and field bees. Assuming that the modeled hive has an average of 30,000 workers, of which 30% are field bees [9], hive and field bees initially had populations of 21,000 and 9,000 bees, respectively. The brood population is initially at 1,500. The brood increases due to the laying of eggs, which is based on the laying rate of the queen. In this model, the laying rate (L) is set at 1,500 eggs per day [21]. First, the brood ecloses, or hatches, into the hive bee class. Hive bees are then recruited to the field bee population, becoming foragers, at 21 days old. In a sustainable hive, foragers die at rate m, determined to be 0.15, or 15% [8]. If no field bees are present in a hive, a minimum of 4 days will pass before hive bees convert to field bees, so uninhibited recruitment ($\alpha$) is set at 0.25. Social inhibition ($\sigma$), the suppression of the transfer from hive bee to forager via pheromone release, is set at 0.75. The variable w controls the rate at which the eclosion rate becomes comparable to the laying rate, and here it is set to 27,000 [4].

2.1.3 Equations

The rate of change of the hive bee population (H) is shown in Eq. 1, and that of the field bee population (F) is shown in Eq. 2. In Eq. 1, eclosion E(H, F) supplies new hive bees, and H R(H, F) represents the rate at which hive bees leave to become foragers. In Eq. 2, the forager population F gains that same number of converted hive bees while losing mF, the number of foragers that die per day.

$$\frac{dH}{dt} = E(H, F) - H\,R(H, F) \tag{1}$$

$$\frac{dF}{dt} = H\,R(H, F) - mF \tag{2}$$

E(H, F), defined in Eq. 3, describes eclosion, the rate at which hive bees hatch in the model with social inhibition, as a function of the hive bee and field bee populations.

$$E(H, F) = L\,\frac{N}{w + N} \tag{3}$$

where the total number of worker bees (N) is calculated as N = H + F. R(H, F), defined in Eq. 4, describes the recruitment rate, the rate of change from hive bee to field bee.

$$R(H, F) = \alpha - \sigma\,\frac{F}{H + F} \tag{4}$$


The effect of premature transition of hive bees to field bees can be modeled with the function


where the constant in Eq. 5 is set to 0.059. When Eq. 5 is less than one, the death rate increases. The relationship between the effect of precocious foraging and the recruitment rate is determined by this constant, an experimental value. P in Eq. 5 is the ratio between the squared regular recruitment rate and the sum of the squared recruitment rate and this constant, which accounts for precocious foraging [4]. Premature foraging affects the death rate (m), so that the new rate of death (mp) for field bees becomes

mp = m / P
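To make the dynamics of the Social Inhibition Model concrete, the following is a minimal Python sketch, not the author's Vensim model. It assumes that eclosion and recruitment take the forms published by Khoury et al. [4], E = L N / (w + N) and R = alpha - sigma F / N with N = H + F, integrates Eqs. 1 and 2 with a simple forward-Euler step, and leaves out the precocious foraging correction of Eqs. 5 and 6. The function name `simulate`, the step size `dt`, and the 100-day horizon are illustrative choices; the parameter values are the ones listed in the Values section.

```python
def simulate(m, days=100, dt=0.01, L=1500.0, w=27000.0,
             alpha=0.25, sigma=0.75, H0=21000.0, F0=9000.0):
    """Forward-Euler integration of hive bees (H) and field bees (F).

    Assumed forms (after Khoury et al.): E = L*N/(w+N), R = alpha - sigma*F/N.
    Returns the populations (H, F) after `days` simulated days.
    """
    H, F = H0, F0
    steps = int(round(days / dt))
    for _ in range(steps):
        N = H + F
        if N <= 0.0:                   # colony is extinct; nothing left to update
            return 0.0, 0.0
        E = L * N / (w + N)            # eclosion rate (assumed form of Eq. 3)
        R = alpha - sigma * F / N      # recruitment rate (assumed form of Eq. 4)
        dH = E - H * R                 # Eq. 1
        dF = H * R - m * F             # Eq. 2
        H = max(H + dt * dH, 0.0)      # populations cannot go negative
        F = max(F + dt * dF, 0.0)
    return H, F


if __name__ == "__main__":
    # Compare the baseline forager mortality with two elevated rates.
    for m in (0.15, 0.30, 0.60):
        H, F = simulate(m)
        print(f"m = {m:.2f}: hive = {H:,.0f}, field = {F:,.0f}, total = {H + F:,.0f}")
```

Under these assumptions, raising `m` from 0.15 towards 0.6 leaves a smaller worker population at day 100, which is the qualitative trend the paper reports for high forager death.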


2.2 Food Dependency Model

The model displayed in Fig. 2 shows the effect of the availability of food on the population dynamics of a honey bee hive. The populations are still described with boxes, the magnitudes of which are affected by the rates of laying, pupation, eclosion, recruitment, and death. Unlike in the Social Inhibition Model, food becomes a variable in this model so that its influence on population dynamics in honey bees may be shown. This model also includes a delay in time from uncapped brood to capped brood. The focus is on how the availability of food affects the dynamics of a hive.

Figure 2: Population dynamics with the effect of food availability

2.2.1 Assumptions

In the Food Dependency Model, bees engage in summertime behaviour, and wintertime behaviour is unaccounted for. Drones are omitted, and all stages up to uncapped brood are included under the population of uncapped brood. Pupation, the transition from larva to pupa, is proportional to brood numbers [4].

2.2.2 Values

This hive begins with the same initial populations of field and hive bees as the previous model. After eggs are laid and hatched, the uncapped brood pupate. The process of getting capped is delayed by a time set at 12 days, the standard time it takes for a western honey bee to exit the cell and emerge as a hive bee. The pupation rate is set to 1/10, or 0.1, as it takes approximately 10 days from the laying of the egg for the cell to be capped [17],[18],[19]. Maximum (max) and minimum (min) recruitment, as well as the social inhibition rate, remain the same as in the other model: 0.25, 0.25, and 0.75, respectively. Minimum recruitment occurs when there are no foragers in a hive but sufficient food, and maximum recruitment controls how workers respond to low food stores [4]. The variable b, which dictates the effect of sufficient food stores on survival and on the transition from hive bee to field bee, is set at 5,000. The variable v, the number of hive bees that results in the survival of half of the brood due to inattention, is set at 5,000 [2]. Based on experimental observation, the maximum collection rate of food is 0.099 g per forager per day [8]. The availability variable a controls what percentage of the maximum collection rate the field bees are able to achieve. The rates of brood, hive bee, and field bee consumption of the food source are 0.018, 0.07, and 0.07 g/day, respectively [2]. The laying rate remains 1,500 eggs per day, as in the Social Inhibition Model.

2.2.3 Equations

The rate of change of the brood population is described in Eq. 7.

S, shown by Eq. 8, models the survival of the brood as a function of the number of hive bees (H) and the amount of food stores (f), with low numbers of either hive bees or food stores resulting in low survival rates. The first component of Eq. 8 approaches 1 as food approaches infinity, which fits with the model: the survival rate is maximized.

The rate of change in hive bee population is shown in Eq. 9. The term in which B is a function of time with a delay of 12 days (the time it takes for the bee to emerge after pupation) is the rate of emergence from pupation.

Recruitment as a function of worker populations (H, F) and food stores (f) is shown in Eq. 10. Recruitment increases with more food and is reduced with social inhibition.

The rate at which food stores change, shown in Eq. 11, is dictated by the rate at which foragers gather food (c) as well as the number of foragers. Each division of bees in the hive consumes honey at a certain rate, and total honey consumption is based on the populations of the different groups. The above equations were obtained from Khoury et al. (2013) [2].

Finally, the rate at which bees collect honey (c) is affected by the availability (a) and maximum collection (cmax) variables and is calculated as below:

c = a * cmax [8]

All hives began with 100 g of food.
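As a quick worked check of the collection-rate formula, the block below evaluates c for the stated maximum of 0.099 g per forager per day at full availability and at the two reduced availability levels used in the results (50% and 40%). The subscripted symbol $c_{\max}$ is simply a typeset form of cmax.

```latex
\begin{align*}
c &= a \cdot c_{\max}, \qquad c_{\max} = 0.099\ \text{g per forager per day} \\
a = 1.0 &: \quad c = 1.0 \times 0.099 = 0.099 \\
a = 0.5 &: \quad c = 0.5 \times 0.099 = 0.0495 \\
a = 0.4 &: \quad c = 0.4 \times 0.099 = 0.0396
\end{align*}
```

Each forager therefore brings in roughly 0.05 g per day at 50% availability and roughly 0.04 g per day at 40% availability.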


3. Results and Analysis

3.1 Social Inhibition Model

3.1.1 High Forager Death

The graphs in Fig. 3 model the effect of various external forces that shorten the lifespan of bees. Examples include Nosema apis, a parasite transmitted by the Varroa mite that causes dysentery and disjointed wings in worker bees [6], and neonicotinoids, a type of pesticide that can cause loss of navigational ability in foragers, decreasing their ability to return to the hive, as well as other neurological issues [22],[3]. As seen in Fig. 3, the honey bee colony is moderately sustainable at 15% forager death, but as the mortality rate increases from outside forces, the colony quickly crumbles.

Figure 3. Honey bee populations with varying mortality rates.

[Figure panel: Ideal Hive (m = 0.15)]

3.1.2 Precocious Foraging

The two graphs in Fig. 4 show what occurs when outside factors place pressure on hive bees to transition early into field bees. Hives with no precocious foraging are able to maintain a colony for much longer than those with it. The base mortality rate for both models is 0.6, or a 60% death rate. In the colony modeled, when precocious foraging occurs, the worker population drops critically by the 30th day. A hive with the same initial conditions but no precocious foraging, also shown in Fig. 4, is able to sustain itself well past 50 days.

Figure 4. Worker populations with different levels of precocious foraging.



3.2 Food Dependency Model

3.2.1 Ideal Hive

Fig. 5 displays the ideal hive, used as a base for comparison to hives experiencing problems.

Figure 5. An ideal hive, with a mortality rate of 0.15. Food is in units of grams.

3.2.2 Decreased Availability

Availability of food for bees may decrease for a number of reasons, one being the destruction of habitats for farming, industry, or residential areas. Climate change may result in early blooms, meaning that by the time the bees are able to forage, many flowers may have died already. On the other hand, bees may emerge earlier than the plants due to warmer temperatures, resulting in unproductive foraging. Invasive species, such as kudzu, are capable of destroying native flowering plants as well. This model shows how a decrease in availability of food, be it due to absence of flowers or lack of foraging space, may lead to colony collapse. In Fig. 6, as the availability decreases, worker population growth drops. Brood population does not decrease, but instead plateaus and is unable to grow further because too few workers are available to take care of the brood. At 50% availability, the hive is still growing, but very slowly. At 40% availability, the worker population drops off significantly, as do food levels. This is consistent with Fig. 2, in which food indirectly affects field bee and hive bee levels. Such population decreases could be one reason why CCD has been occurring as it has, with brood populations still present but minimal workers.

Figure 6. Effects of Food Availability on Population (panels: Model 2 ideal hive, availability at 50%, availability at 40%)

3.2.3 Decreased Laying Rate

Queens may experience infertility or other laying-related problems after long-term exposure to neonicotinoids, especially since the chemicals have time to build up in the queen's body due to her long lifespan [22],[7]. Changing the laying rate to 900 eggs per day shows the indirect effect that the use of neonicotinoid pesticides has on hive survival. Eventually, if the queen continues to lay eggs at the low rate of 900 eggs per day as opposed to a healthy 1500-2000 eggs per day [5], the hive will collapse due to a lack of brood to hatch into workers to sustain the population, as displayed in Fig. 7. Note that the y-axis (bees) of Fig. 7 begins in the negative range; all bee populations fall off relatively quickly. There is a brief spike in hive bees near the beginning due to rapidly decreasing field bee numbers and increasing food stores, but the hive population steadily decreases once the situation stabilizes. In reality, the food stores would not grow quite as they do in the graph. This model is not meant to perfectly replicate reality, but rather to give a general sense of the effects of factors that negatively affect bees.

Figure 7. Queen laying deficiency

3.2.4 Low Laying Rates and Low Availability

With laying rate L at 900 and food availability at 0.4, or 40% of maximum food gathering capability, the field and hive bees drop off rapidly, while the brood numbers increase very slowly in Fig. 8. Food stores actually increase marginally. This is logical, as the hive dies off very slowly and still has the capability to build up food stores, as evidenced in Fig. 8. In general, after the initial drops or increases in Fig. 8, there is little change in population size. This is because all populations have reached around the same levels and are not fluctuating in response to each other.

Figure 8. Laying rate at 900, food availability at 40%

3.3 Analysis

In this paper, two systems dynamics models were constructed: one focusing on the population dynamics of honey bee hives when hive bees transition prematurely into field bees (the Social Inhibition Model, or Model 1), and the other focusing on the effect that food stores and collection can have on a colony (the Food Dependency Model, or Model 2). The differential equations for both models were obtained from Khoury et al. in their 2011 and 2013 papers, with the exception of the food collection equation c = a*cmax, which was adapted from Russell et al. Different values were used for the populations and various variables based upon the author's own research. The models were created with the modeling software Vensim [1].

With Model 1, the effects of high forager mortality on hive populations are shown. As would be expected, colony collapse quickly ensues as forager mortality rates increase, as seen in Fig. 3. Precocious foraging due to a shortage of field bees, modeled in Fig. 4, will cause the colony to shut down at a noticeably higher rate than a colony with no precocious foraging. High rates of death for field bees may result from a number of sources, including the pesticide class neonicotinoid, which is commonly sprayed in the vicinity of beehives, especially beehives engaged in commercial agriculture pollination [3]. Neonicotinoids have more short-term effects than long-term ones, but the effects can be deadly for honey bees. After consuming food treated with neonicotinoids, bees experience disorientation, loss of navigational memory, and lowered communication abilities [7]. This, in conjunction with the viruses and parasites transmitted to bees via the Varroa mite, including the Nosema parasite, Deformed Wing Virus, and Chronic Bee Paralysis Virus [21], all of which cause death or inability to fly, could result in Colony Collapse Disorder. Foragers might lose their way back to the hive, unable to relocate the colony, or may die outside of the hive, leaving low forager populations and thus food shortages. Social inhibition would then decrease, further accelerating the decline of the hive as precocious foraging begins, as shown in Fig. 4. These factors together may result in Colony Collapse Disorder, the main symptoms of which are a lack of worker bees in the hive but a queen and brood still present, along with food stores [17].

A combination of low food stores and a low laying rate due to a neonicotinoid-affected queen, seen in Fig. 8, results in very few worker bees, with brood numbers higher, but only marginally so, and food stores still increasing very slowly, which is a similar situation to that seen in cases of CCD. These models are also helpful to beekeepers who may want to determine how much honey one can safely remove from a hive without disturbing the health of the colony.

4. Conclusion

Honey bees, as well as bees of all types, are an incredibly important part of the global economy, contributing an estimated $200 billion in agricultural pollination worldwide annually [23]. However, they face an overwhelming number of threats, from pesticides to foreign parasites like the Varroa mite. Pressures on the hive can lead to premature transition from hive bee to forager bee, leading to higher death rates. Climate change affects availability of food, also contributing to hive failure. Viruses and parasites can wreak havoc upon a colony, especially when the viruses affect the queen and her egg laying capabilities. As modeled in this paper, bees require field bee longevity in order to maintain a hive, as well as sufficient area to forage. Although some may argue that bees are making a recovery due to increased worldwide population and fewer cases of CCD in recent years [15], worldwide numbers are trending upwards because countries in Asia and Africa are only recently adding their numbers to the worldwide tallies [24]. Cases of CCD may be decreasing in the United States, but year-round hive losses are still abnormally high at 42.1% for April 2014-2015 in the US [15], more than double the suggested sustainability rate of 18.7% [25].

Some improvements to the models used may include more rates that involve beekeepers, such as how human interaction may improve colony survival, or how the removal of honey from the hive will affect the colony. Seasonality and seasonal behaviour should also be accounted for in future models, as well as death rates in the brood and hive bees. Many of the plights plaguing bees are either caused by humans or can be stopped by human intervention, and as citizens of a global world with a rapidly increasing population, we owe it to not only the bees but to ourselves to decrease hive losses and preserve pollinators for our mutual benefit. This world would not be able to function without the presence of the honey bee, and it is up to the citizens of the planet to conserve such a precious resource.

5. Acknowledgements

The author would like to thank Mr. Bob Gotwals for being an incredible teacher and providing the author with the background knowledge to be able to produce these models. The author would also like to thank Amalan Iyengar and Xiaoyan Qi for mostly helpful advice on mathematical concepts. Lastly, the author would like to thank NCSSM for offering Introduction to Computational Science as a class.

6. References

[1] Ventana Systems, Inc. (2006). Vensim PLE Software, Ventana Systems Inc. Retrieved May 30, 2008, from the website temoa : Open Educational Resources (OER) Portal at
[2] Khoury, David S., Andrew B. Barron, and Mary R. Myerscough. “Modelling Food and Population Dynamics in Honey Bee Colonies.” PLoS ONE 8.5 (2013): n. pag. PLOS. Web. 13 Jan. 2016.
[3] Mohan, Geoffrey. “Bees Threatened by a Common Pesticide, EPA Finds.” Los Angeles Times. Los Angeles Times, 6 Jan. 2016. Web. 13 Jan. 2016.
[4] Khoury, David S., Mary R. Myerscough, and Andrew B. Barron. “A Quantitative Model of Honey Bee Colony Population Dynamics.” PLoS ONE 6.4 (2011): n. pag. Web. 13 Jan. 2016.
[5] Food and Agriculture Organization of the United Nations. “Ch05.” Ch05. FAO, n.d. Web. 13 Jan. 2016.
[6] Bessin, Ric. “Varroa Mites Infesting Honey Bee Colonies.” Varroa Mites in Honey Bee Colonies. UKAG, n.d. Web. 13 Jan. 2016.
[7] Lundin, Ola, Maj Rundlöf, Henrik G. Smith, Ingemar Fries, and Riccardo Bommarco. “Neonicotinoid Insecticides and Their Impacts on Bees: A Systematic Review of Research Approaches and Identification of Knowledge Gaps.” PLoS ONE 10.8 (2015): n. pag. Web. 13 Jan. 2016.
[13] Barron, Andrew. “A Social Analysis of Honey Bee Colony Failure.” Understanding Colony Collapse. The Hermon Slade Foundation, n.d. Web. 14 Jan. 2016.
[14] “Why We Need Bees.” Bee Facts. Natural Resources Defense Council, 2011. Web. 13 Jan. 2016.
[15] Kaplan, Kim. “Bee Survey.” ARS. US Department of Agriculture, 13 May 2015. Web. 13 Jan. 2016.
[16] Von Englesdorp, Dennis. “Fall Dwindle Disease.” CCD Colony Collapse Disorder (n.d.): n. pag. Web. 13 Jan. 2016.
[17] Bush, Michael. “Beekeeping.” Bush Bees, Foundationless Frames, Top Bar Hive, Queens, Survivor Bees, Long Hives, Natural Cell Size, Small Cell Bees, Small Cell Beekeeping, Regression, Natural Beekeeping, Michael Bush. Bush Bees, 2006. Web. 14 Jan. 2016.
[18] Blackiston, Howland. “Tracking the Life Cycle of a Honey Bee.” Beekeeping for Dummies. 2nd ed. N.p.: John Wiley and Sons, 2009. N. pag. Beekeeping for Dummies. Web. 13 Jan. 2016.
[19] Western Beekeepers. Bee Schedule. 2014. Web. 13 Jan. 2016.
[20] SITNFlash. “To Bee or Not to Bee: Social Dynamics Impact Productivity and Stress Response in Honey Bees.” Science in the News. Harvard University, 1 Apr. 2015. Web. 13 Jan. 2016.
[21] Moore, Phillip A., Michael E. Wilson, and John A. Skinner. “Honey Bee Queens: Evaluating the Most Important Colony Member - EXtension.” Honey Bee Queens. Cooperative Extension System, 18 Aug. 2015. Web. 13 Jan. 2016.
[22] Sandrock, Christoph, Matteo Tanadini, Lorenzo G. Tanadini, Aline Fauser-Misslin, Simon G. Potts, and Peter Neumann. “Impact of Chronic Neonicotinoid Exposure on Honeybee Colony Performance and Queen Supersedure.” PLoS ONE 9.8 (2014): n. pag. PLOS. Web. 12 Jan. 2016.
[23] Syngenta. “Bee Facts.” Bee Facts. Syngenta Canada Inc., 2014. Web. 14 Jan. 2016.
[24] Vanengelsdorp, Dennis, and Marina Doris Meixner. “A Historical Review of Managed Honey Bee Populations in Europe and the United States and the Factors That May Affect Them.” Journal of Invertebrate Pathology 103 (2010): n. pag. ScienceDirect. Web. 13 Jan. 2016.
[25] Wilson, Michael. “Colony Loss 2014-2015: Preliminary Results.” Bee Informed Partnership. Bee Informed Partnership, 13 May 2015. Web. 13 Jan. 2016.


An Analysis of Recursive Properties on Counting Independent Sets of Select Graphs

Peter Cheng & Caleb Cox

ABSTRACT

The goal of our work is to count the number of independent sets in generalized graphs based on recursive properties from continuously adding vertices. The results include formulas for generalized “Dumbbell”, “Lollipop”, “Haircomb”, “Spoke”, Cycle, and Path graphs. Interesting patterns were found that connect Path graphs with the Fibonacci series, and Cycle graphs with the Lucas numbers. As we analyzed more complex graphs, the generalized forms of the sub-graphs were paramount to recursively counting the entire graph. The methods used to arrive at our formulas also demonstrate exactly how independent sets are formed when new vertices are added to maintain a similar, but larger structure. Finally, the results for our generalized “Haircomb” and “Spoke” graphs lay the foundation for counting the number of independent sets in generalized Petersen graphs.

1. Preliminaries

Definition 1. Any set of $k$ vertices, $0 \le k \le n$, on an undirected graph $G$ with $n$ vertices is said to be an independent set if all vertices within the set are non-adjacent to each other.

Definition 2. A linear graph with $n$ vertices, composed of two terminal vertices and $n - 2$ vertices of degree 2, is a path graph denoted as $P_n$.

Figure 1: A linear path graph $P_n$

Definition 3. A cycle graph is a graph that consists of a single cycle with $n$ vertices such that each vertex has degree two.

Figure 2: A planar cycle graph $C_n$

The $(n,k)$-lollipop graph is usually described as a special type of graph consisting of an $n$-complete graph and a $k$-path graph connected to a vertex on the complete graph. Here, however, we define the lollipop graph using an $n$-cycle graph instead of an $n$-complete graph.

Definition 4. A graph is a “lollipop” graph if it consists of an $n$-vertex cycle graph and a $k$-vertex path graph, where $k > 1$. A “lollipop” graph is denoted as $L_{n,k}$.

Figure 3: A planar lollipop graph $L_{n,k}$

Definition 5. A graph is called a “dumbbell” graph if it consists of two cycle graphs having $n$ and $m$ vertices respectively, which are connected by a path graph with $k$ vertices. A “dumbbell” graph formed in this way is denoted as $B_{n,k,m}$.

Figure 4: A planar dumbbell graph $D_{n,k,m}$

Definition 6. For any given graph $G$, denote the total number of independent sets contained in graph $G$ as $S(G)$.

2. The Total Number of Independent Sets in Path, Cycle and Lollipop Graphs

To study the recursion property on the total number of independent sets of a complex graph, we first investigate the simplest graph: a path graph. The result resembles the recursion property of Fibonacci’s series.

Lemma 1. For a path graph with $n$ vertices $v_1, v_2, v_3, \ldots, v_n$, the total number of independent sets of $P_n$ is the sum of the total independent sets of its sub-graphs $P_{n-1}$ and $P_{n-2}$, such that:

$$S(P_n) = S(P_{n-1}) + S(P_{n-2})$$


Proof. Refer to the figure above. First, we exclude vertex $v_n$ from $P_n$, resulting in a path graph $P_{n-1}$ with $n-1$ vertices. Let the total number of independent sets contained in $P_{n-1}$ be $S(P_{n-1})$. Next, we add vertex $v_n$ to graph $P_{n-1}$ by connecting $v_n$ to $v_{n-1}$. The new independent sets in $P_n$ formed by adding vertex $v_n$ must contain $v_n$, so the number of new independent sets can be derived as follows. We do not consider vertex $v_{n-1}$ in the new independent sets, because $v_{n-1}$ and $v_n$ are adjacent: no independent set containing $v_{n-1}$ from path $P_{n-1}$ can be extended to an independent set containing $v_n$. Let $s_1, s_2, s_3, \ldots, s_i$, where $i = S(P_{n-2})$ and $s_1 = \emptyset$, be all of the independent sets of $P_{n-2}$. We can now create the new independent sets:

$$(s_1, v_n),\ (s_2, v_n),\ (s_3, v_n),\ \ldots,\ (s_i, v_n)$$

This holds because every independent set of $P_n$ that contains $v_n$ is formed by adding $v_n$ as an element to an independent set of $P_{n-2}$. The total number of new independent sets is therefore $S(P_{n-2})$. The number of independent sets excluding vertex $v_n$ is $S(P_{n-1})$, and the number of independent sets including vertex $v_n$ is $S(P_{n-2})$. Combining these two, we have

$$S(P_n) = S(P_{n-1}) + S(P_{n-2})$$

Lemma 1 is thus proved.

To further explore the recursion property on the total number of independent sets of more complicated graphs, we need to investigate this property on a cycle graph. The following lemma shows the recursive relationship between the number of independent sets of a cycle graph and those of its path sub-graphs.

Lemma 2. For a cycle graph $C_n$ with $n$ vertices $v_1, v_2, v_3, \ldots, v_n$, the total number of independent sets of $C_n$ is the sum of the total independent sets of its sub-graphs $P_{n-1}$ and $P_{n-3}$,

$$S(C_n) = S(P_{n-1}) + S(P_{n-3})$$


where $P_{n-1}$ and $P_{n-3}$ are sub-graphs of $C_n$: $P_{n-3}$ has the $n-3$ vertices $v_2, v_3, \ldots, v_{n-2}$, and $P_{n-1}$ has the $n-1$ vertices $v_1, v_2, v_3, \ldots, v_{n-1}$.

Proof. First start with a cycle graph $C_{n-1}$, shown in Figure 5, and let the total number of independent sets contained in $C_{n-1}$ be $S(C_{n-1})$. Then we add vertex $v_n$ to the graph $C_{n-1}$ by removing the edge between $v_1$ and $v_{n-1}$, forming an edge between $v_{n-1}$ and $v_n$, and forming an edge between $v_1$ and $v_n$. This forms the cycle graph $C_n$. We have two observations at this point. First, the new independent sets formed by adding vertex $v_n$ must contain $v_n$ and cannot contain $v_{n-1}$ or $v_1$, because $v_n$ is adjacent to both $v_{n-1}$ and $v_1$. Therefore the new independent sets consist of $v_n$ together with vertices from $v_2$ to $v_{n-2}$, which form a path graph $P_{n-3}$, so the number of independent sets in $C_n$ that contain $v_n$ is equal to $S(P_{n-3})$. Second, the independent sets in $C_n$ not containing $v_n$ use only vertices $v_1$ to $v_{n-1}$, which form the path $P_{n-1}$ once $v_n$ is excluded, so the number of independent sets in $C_n$ not containing $v_n$ is equal to $S(P_{n-1})$. Summing the two cases gives $S(C_n) = S(P_{n-1}) + S(P_{n-3})$. Lemma 2 is thus proved.
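Lemmas 1 and 2 can be checked numerically for small graphs. The sketch below is an illustrative brute-force check rather than part of the original paper: it enumerates every vertex subset and counts the ones with no adjacent pair, with the empty set counted as independent. The helper names (`count_independent_sets`, `path_edges`, `cycle_edges`, `S_path`, `S_cycle`) and the 0-based vertex labels are assumptions made for this example.

```python
from itertools import combinations


def count_independent_sets(n_vertices, edges):
    """Count all independent sets (including the empty set) by brute force."""
    edge_set = {frozenset(e) for e in edges}
    total = 0
    for size in range(n_vertices + 1):
        for subset in combinations(range(n_vertices), size):
            # A subset is independent when no two of its vertices share an edge.
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2)):
                total += 1
    return total


def path_edges(n):
    """Edges of the path P_n on vertices 0..n-1 (no edges when n < 2)."""
    return [(i, i + 1) for i in range(n - 1)]


def cycle_edges(n):
    """Edges of the cycle C_n on vertices 0..n-1, intended for n >= 3."""
    return path_edges(n) + [(n - 1, 0)]


def S_path(n):
    return count_independent_sets(n, path_edges(n))


def S_cycle(n):
    return count_independent_sets(n, cycle_edges(n))


if __name__ == "__main__":
    for n in range(3, 11):
        assert S_path(n) == S_path(n - 1) + S_path(n - 2)    # Lemma 1
        assert S_cycle(n) == S_path(n - 1) + S_path(n - 3)   # Lemma 2
        print(n, S_path(n), S_cycle(n))
```

Brute force is exponential in the number of vertices, so this check is only practical for small n; the recursions in the lemmas are what make larger cases tractable.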

Figure 5: A planar cycle graph with n vertices

Our further study shows that the number of independent sets of a cycle graph with $n$ vertices can also be expressed recursively in terms of the cycle graphs $C_{n-1}$ and $C_{n-2}$.

Lemma 3. For a cycle graph with $n$ vertices $v_1, v_2, v_3, \ldots, v_n$, the total number of independent sets of $C_n$ is the sum of the total independent sets of its sub-graphs $C_{n-1}$ and $C_{n-2}$, such that:

$$S(C_n) = S(C_{n-1}) + S(C_{n-2})$$

where $C_{n-1}$ and $C_{n-2}$ are sub-graphs of $C_n$.

Proof. We can easily obtain the following equations from Lemma 2,

$$S(C_{n-1}) = S(P_{n-2}) + S(P_{n-4})$$

$$S(C_{n-2}) = S(P_{n-3}) + S(P_{n-5})$$

Adding the two equations, we have

$$S(C_{n-1}) + S(C_{n-2}) = S(P_{n-2}) + S(P_{n-3}) + S(P_{n-4}) + S(P_{n-5})$$

Using the recursion property from Lemma 1 to simplify the right-hand side, pairing $S(P_{n-2}) + S(P_{n-3}) = S(P_{n-1})$ and $S(P_{n-4}) + S(P_{n-5}) = S(P_{n-3})$, we arrive at:

$$S(C_{n-1}) + S(C_{n-2}) = S(P_{n-1}) + S(P_{n-3})$$

By Lemma 2, $S(C_{n-1}) + S(C_{n-2}) = S(P_{n-1}) + S(P_{n-3}) = S(C_n)$, therefore the lemma is true.

We have studied the recursion property for path and cycle graphs. A more complicated graph consisting of these two types of graphs is called a “lollipop” graph. We now turn our investigation towards this kind of graph.

Lemma 4. Given a lollipop graph $L_{n,k}$ with an $n$-vertex cycle and a $k$-vertex path, the total number of independent sets in $L_{n,k}$ can be expressed as,

$$S(L_{n,k}) = S(L_{n,k-1}) + S(L_{n,k-2})$$

Proof. First, let’s consider the lollipop graph $L_{n,k}$ shown in Figure 6, and remove vertex $v_{n+k}$ to construct lollipop graph $L_{n,k-1}$. The total number of independent sets in $L_{n,k-1}$ is $S(L_{n,k-1})$. Next, add $v_{n+k}$ back onto $L_{n,k-1}$ to reconstruct $L_{n,k}$. The new independent sets formed with vertex $v_{n+k}$ can only include vertices from $v_1$ to $v_{n+k-2}$, because $v_{n+k}$ and $v_{n+k-1}$ are adjacent. Therefore the new independent sets are all the independent sets of $L_{n,k-2}$ with a new element or vertex: $v_{n+k}$. The total number of new independent sets formed by adding $v_{n+k}$ is therefore $S(L_{n,k-2})$. Summing $S(L_{n,k-1})$ and $S(L_{n,k-2})$, the total number of independent sets in $L_{n,k}$ is then,

$$S(L_{n,k}) = S(L_{n,k-1}) + S(L_{n,k-2})$$

Therefore Lemma 4 is proved.
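The recursion in Lemma 4 can be spot-checked in the same brute-force way. The sketch below is illustrative only: the function names are hypothetical, and the labelling (cycle vertices 0 to n-1, path vertices n to n+k-1, with the first path vertex attached to cycle vertex n-1) is an assumed 0-based version of the labelling in Figure 6.

```python
from itertools import combinations


def count_independent_sets(n_vertices, edges):
    """Count all independent sets (including the empty set) by brute force."""
    edge_set = {frozenset(e) for e in edges}
    total = 0
    for size in range(n_vertices + 1):
        for subset in combinations(range(n_vertices), size):
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2)):
                total += 1
    return total


def lollipop_edges(n, k):
    """Edges of the lollipop L_{n,k}: an n-cycle with a k-vertex path attached."""
    cycle = [(i, (i + 1) % n) for i in range(n)]
    # The tail starts at cycle vertex n-1 and runs through vertices n..n+k-1.
    tail = [(i, i + 1) for i in range(n - 1, n + k - 1)]
    return cycle + tail


def S_lollipop(n, k):
    return count_independent_sets(n + k, lollipop_edges(n, k))


if __name__ == "__main__":
    for n in range(3, 7):
        for k in range(2, 6):
            # Lemma 4: S(L_{n,k}) = S(L_{n,k-1}) + S(L_{n,k-2})
            assert S_lollipop(n, k) == S_lollipop(n, k - 1) + S_lollipop(n, k - 2)
    print("Lemma 4 holds for 3 <= n <= 6 and 2 <= k <= 5")
```

With k = 0 the tail is empty and the function simply counts the independent sets of the cycle, which supplies the base case of the recursion.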

Figure 6: A lollipop graph with n + k vertices

3. Fibonacci’s Series and the Total Number of Independent Sets of Path, Cycle, and Lollipop Graphs

Theorem 1. The total number of independent sets of lollipop graph $L_{n,k}$ can be expressed as

$$S(L_{n,k}) = a_{k+1} S(C_n) + a_k S(P_{n-1}) \tag{4}$$

where $a_k$ is the kth term of Fibonacci’s series.

Proof. First, examine the lollipop graph $L_{n,k}$, referring to Figure 6. The number of independent sets from $v_1$ to $v_n$ is simply the number of independent sets in $C_n$, or $S(C_n)$. Adding in vertex $v_{n+1}$, all new independent sets containing $v_{n+1}$ can only be made with vertices from $v_1$ to $v_{n-1}$, because $v_{n+1}$ and $v_n$ are adjacent. The number of new independent sets containing $v_{n+1}$ is therefore $S(P_{n-1})$. So the total number of independent sets in $L_{n,1}$ is

$$S(L_{n,1}) = S(C_n) + S(P_{n-1})$$

Repeating the process above for lollipop graph $L_{n,2}$ results in

$$S(L_{n,2}) = 2S(C_n) + S(P_{n-1})$$

From Lemma 4, we recursively sum the two preceding lollipops, resulting in

$$S(L_{n,3}) = 3S(C_n) + 2S(P_{n-1})$$
$$S(L_{n,4}) = 5S(C_n) + 3S(P_{n-1})$$
$$S(L_{n,5}) = 8S(C_n) + 5S(P_{n-1})$$
$$\vdots$$
$$S(L_{n,k}) = a_{k+1} S(C_n) + a_k S(P_{n-1})$$

Therefore, Theorem 1 is proved.

Theorem 2. The total number of independent sets in a path graph $P_k$ is the (k+2)th term of Fibonacci’s series, and the total number of independent sets in a cycle graph $C_n$ is the sum of the (n-1)th and the (n+1)th terms of Fibonacci’s series, also known as the Lucas numbers:

$$S(P_k) = a_{k+2}$$

and

$$S(C_n) = a_{n+1} + a_{n-1}$$

where $a_{k+2}$ is the (k+2)th term of Fibonacci’s series, and $a_{n+1}$, $a_{n-1}$ are the (n+1)th and the (n-1)th terms of Fibonacci’s series respectively.

Proof. Consider the cycle graph $C_n$ and the lollipop graph $L_{n,k}$ shown in Figure 6. The number of independent sets in $L_{n,k}$ is the sum of the independent sets in $C_n$ and the new sets formed by connecting a path $P_k$ to $C_n$. Adding a path $P_k$ onto cycle $C_n$ produces independent sets in two ways:

(1): Vertices from $v_{n+2}$ to $v_{n+k}$, or the path $P_{k-1}$, are able to form new independent sets with the independent sets in cycle graph $C_n$. The number of unions of an independent set of $P_{k-1}$ with an independent set of $C_n$ is $S(C_n) \cdot S(P_{k-1})$.

(2): The new independent sets that contain vertex $v_{n+1}$ can only contain vertices from $v_1$ to $v_{n-1}$ and vertices from $v_{n+3}$ to $v_{n+k}$, because $v_{n+1}$ is adjacent to $v_n$ and $v_{n+2}$. The path graphs that correspond to the specified vertex ranges are $P_{n-1}$ and $P_{k-2}$, respectively. The number of unions of an independent set of $P_{n-1}$ with an independent set of $P_{k-2}$ is $S(P_{n-1}) \cdot S(P_{k-2})$.

Combining (1) and (2), the total number of independent sets in lollipop graph $L_{n,k}$ is expressed as:

$$S(L_{n,k}) = S(P_{k-1}) \cdot S(C_n) + S(P_{k-2}) \cdot S(P_{n-1})$$

Comparing the result with eq. (4) of Theorem 1, $S(P_{k-1}) = a_{k+1}$, therefore $S(P_k) = a_{k+2}$. Using this result and calling on Lemma 2,

$$S(C_n) = S(P_{n-1}) + S(P_{n-3}) = a_{n+1} + a_{n-1}$$

Therefore, Theorem 2 is proved.

4. Recursion Properties of the Total Number of Independent Sets in Dumbbell Graph

We have investigated the sub-graphs of dumbbell graphs (paths, cycles and lollipops) and obtained some preliminary results in section 2. A dumbbell graph consists of all of these graphs, so a foundation is laid for our further investigation of the total number of independent sets of a dumbbell graph.

Theorem 3. For a dumbbell graph $D_{n,k,m}$, the total number of independent sets in $D_{n,k,m}$ can be expressed as

$$S(D_{n,k,m}) = S(L_{n,k+m-1}) + S(P_{m-2}) S(L_{n,k-1}) \tag{7}$$

Proof. Consider constructing dumbbell graph $D_{n,k,m}$ from lollipop graph $L_{n,k+m-1}$, resulting in Figure 7. We add in vertex $v_{n+k+m}$ to $L_{n,k+m-1}$, and delete the edge between vertices $v_{n+k}$ and $v_{n+k+m-1}$, and lastly create an edge between vertices $v_{n+k}$ and $v_{n+k+m}$. We analyze how new independent sets are formed from $L_{n,k+m-1}$ to $D_{n,k,m}$.

Figure 7: Dumbbell graph $D_{n,k,m}$

All the independent sets in $D_{n,k,m}$ that do not contain vertex $v_{n+k+m}$ are the independent sets in $D_{n,k,m}$ containing only vertices from $v_1$ to $v_{n+k+m-1}$, because $v_{n+k+m-1}$ and $v_{n+k+m}$ are adjacent. This is equal to $S(L_{n,k+m-1})$. The new independent sets formed by adding vertex $v_{n+k+m}$ can only contain vertices from $v_1$ to $v_{n+k-1}$ and $v_{n+k+1}$ to $v_{n+k+m-2}$, because $v_{n+k+m}$ is adjacent to $v_{n+k}$ and $v_{n+k+m-1}$. Therefore the number of independent sets from $v_1$ to $v_{n+k-1}$ is $S(L_{n,k-1})$. In addition, the number of sets from $v_{n+k+1}$ to $v_{n+k+m-2}$ is $S(P_{m-2})$. Because vertices from $v_1$ to $v_{n+k-1}$ and vertices from $v_{n+k+1}$ to $v_{n+k+m-2}$ are nonadjacent sub-graphs, the number of new independent sets is exactly $S(P_{m-2}) \cdot S(L_{n,k-1})$. Therefore the total number of independent sets in $D_{n,k,m}$ is given by eq. (7).

Corollary 1. The total number of independent sets in $D_{n,k,m}$ can be expressed by the terms of Fibonacci’s series as

$$S(D_{n,k,m}) = a_{n+1}(a_{k+m+1} + a_{k+1} a_m) + a_{n-1}(a_{k+m} + a_k a_m) \tag{8}$$

Proof. Through Theorem 1, Theorem 2, and Theorem 3, equation (8) of Corollary 1 is true.

Theorem 4. The total number of independent sets in $D_{n,k,m}$ is given by

$$S(D_{n,k,m}) = S(D_{n,k,m-1}) + S(D_{n,k,m-2})$$

Proof. Directly from Theorem 3, we have

$$S(D_{n,k,m-1}) = S(L_{n,k+m-2}) + S(P_{m-3}) S(L_{n,k-1})$$
$$S(D_{n,k,m-2}) = S(L_{n,k+m-3}) + S(P_{m-4}) S(L_{n,k-1})$$

From Lemma 4 and Lemma 1, we add the two results:

$$S(D_{n,k,m-1}) + S(D_{n,k,m-2}) = S(L_{n,k+m-1}) + S(P_{m-2}) S(L_{n,k-1})$$

The right-hand side of the equation above is equal to $S(D_{n,k,m})$, thus Theorem 4 is proved.

5. Recursive Methods in Counting Independent Sets of Other Complex Graphs

After studying simple graphs and their recursive properties, we extend the techniques of recursion to count the number of independent sets in more complicated graphs.
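The closed forms for lollipop and dumbbell graphs can be cross-checked numerically. The following is a minimal sketch, assuming the Fibonacci indexing a_1 = a_2 = 1 (with a_0 = 0) that Theorem 2 requires; the function names are ours. It evaluates Theorem 3 through Theorems 1 and 2, then compares the result with Corollary 1 and with the recursion of Theorem 4.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(i):
    """Fibonacci numbers with a_0 = 0 and a_1 = a_2 = 1 (assumed indexing)."""
    return i if i < 2 else fib(i - 1) + fib(i - 2)


def S_path(k):
    return fib(k + 2)                                         # Theorem 2


def S_cycle(n):
    return fib(n + 1) + fib(n - 1)                            # Theorem 2


def S_lollipop(n, k):
    return fib(k + 1) * S_cycle(n) + fib(k) * S_path(n - 1)   # Theorem 1


def S_dumbbell(n, k, m):
    # Theorem 3
    return S_lollipop(n, k + m - 1) + S_path(m - 2) * S_lollipop(n, k - 1)


def corollary_1(n, k, m):
    return (fib(n + 1) * (fib(k + m + 1) + fib(k + 1) * fib(m))
            + fib(n - 1) * (fib(k + m) + fib(k) * fib(m)))


for n in range(3, 10):
    for k in range(1, 8):
        for m in range(4, 10):
            assert S_dumbbell(n, k, m) == corollary_1(n, k, m)
            # Theorem 4: Fibonacci-type recursion in m
            assert S_dumbbell(n, k, m) == S_dumbbell(n, k, m - 1) + S_dumbbell(n, k, m - 2)
```

This check only confirms that the stated formulas agree with one another; it does not count independent sets of an actual dumbbell graph.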

5.1 More Preliminaries

Definition 7. A graph is a “Haircomb” graph if it is a tree with 2n vertices where n vertices are leaves, n-2 vertices have degree 3, and 2 vertices have degree 2. We denote a Haircomb graph with 2n vertices by $H_n$. An example of the structure of a Haircomb graph is shown in Figure 8.

Definition 8. A “Spoke” graph is defined as a graph that contains 2n vertices where n vertices form a cycle and each vertex on the cycle is adjacent to a lone vertex. More informally, a spoke graph is the cyclic version of a haircomb graph. A spoke graph with 2n vertices is denoted by $Z_n$. An example of a spoke graph is provided in figure 9.

Figure 8: Haircomb graph $H_n$

Lemma 5. The total number of independent sets contained in a Haircomb graph $H_n$ can be expressed as:

$$S(H_n) = 2 S(H_{n-1}) + 2 S(H_{n-2})$$

Proof. Begin by analyzing the Haircomb graph $H_{n-1}$. Now add vertex $v_n$ and its corresponding spoke vertex onto $H_{n-1}$, creating Haircomb $H_n$.

(1): The number of independent sets existing before adding vertex $v_n$ and its spoke vertex is simply $S(H_{n-1})$.

Figure 9: $S(H_{n-1})$ independent sets prior to adding $v_n$ and its spoke.

(2): The new independent sets that contain vertex $v_n$ are formed with the following vertices: the isolated spoke vertex of vertex $v_{n-1}$ and all vertices and corresponding spokes from $v_1$ to $v_{n-2}$. The number of new independent sets containing $v_n$ is therefore $S(P_1) \cdot S(H_{n-2})$, or $2 S(H_{n-2})$.

Figure 10: Two sub-graphs of vertices that $v_n$ can form independent sets with.

(3): The number of new independent sets formed by adding the spoke vertex of $v_n$ is equal to the number of independent sets in Haircomb $H_{n-1}$, because the spoke vertex of $v_n$ is only adjacent to $v_n$, so it can form independent sets with all vertices and corresponding spoke vertices from $v_1$ to $v_{n-1}$.

Figure 11: Graph of vertices that the spoke of $v_n$ can form independent sets with.

Adding the results, the total number of independent sets of Haircomb graph $H_n$ is therefore $S(H_n) = 2 S(H_{n-1}) + 2 S(H_{n-2})$, thus Lemma 5 is proved.

Corollary to Lemma 5. The total number of independent sets contained in a Haircomb graph $H_n$ can also be expressed as:

$$S(H_n) = \frac{(3 + 2\sqrt{3})(1 + \sqrt{3})^n + (3 - 2\sqrt{3})(1 - \sqrt{3})^n}{6}$$

Proof. This is derived by solving the second-order linear difference equation in Lemma 5 with initial conditions f(3) = 22 and f(4) = 60.

Lemma 6. The total number of independent sets contained in a Spoke graph $Z_n$ can be expressed as:

$$S(Z_n) = 2 S(H_{n-1}) + 4 S(H_{n-3})$$

Figure 12: Spoke graph $Z_n$

Proof. Begin by examining the Spoke graph $Z_n$. Removing vertex $v_n$ and its corresponding spoke vertex results in the Haircomb graph $H_{n-1}$, which has $S(H_{n-1})$ independent sets. We now add vertex $v_n$ and its spoke vertex back onto $H_{n-1}$ to re-form $Z_n$, and we analyze the creation of new independent sets.

(1): Because vertex $v_n$ is adjacent to $v_1$ and $v_{n-1}$, vertex $v_n$ forms new independent sets with vertices and corresponding spokes from $v_2$ to $v_{n-2}$, and with the isolated spoke vertices of $v_1$ and $v_{n-1}$. The addition of vertex $v_n$ thus forms $[S(P_1)]^2 \cdot S(H_{n-3})$, or $4 S(H_{n-3})$, new independent sets.

(2): The spoke vertex of $v_n$ forms new independent sets with vertices and corresponding spokes from $v_1$ to $v_{n-1}$. The spoke vertex of $v_n$ thus forms $S(H_{n-1})$ new independent sets.

Together with the $S(H_{n-1})$ independent sets that exist before $v_n$ and its spoke are added, the total number of independent sets in Spoke graph $Z_n$ is therefore equal to $2 S(H_{n-1}) + 4 S(H_{n-3})$.

Lemma 6 is proved.
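Lemma 5 and Lemma 6 can also be checked by exhaustive counting on small cases. The sketch below is illustrative and not from the original paper: the vertex labelling (spine vertices 0..n-1, with spine vertex i carrying leaf n+i) is our assumption, and `closed_form` is the solution of the Lemma 5 recurrence with S(H_3) = 22 and S(H_4) = 60, whose printed presentation in the paper may differ.

```python
import math


def count_independent_sets(num_vertices, edges):
    """Count independent sets (the empty set included) by testing every vertex subset."""
    neighbours = [0] * num_vertices
    for u, v in edges:
        neighbours[u] |= 1 << v
        neighbours[v] |= 1 << u
    total = 0
    for subset in range(1 << num_vertices):
        if all(not (subset >> v) & 1 or not neighbours[v] & subset
               for v in range(num_vertices)):
            total += 1
    return total


def haircomb_edges(n):
    """Spine 0..n-1 forms a path; spine vertex i also carries the leaf n + i."""
    return [(i, i + 1) for i in range(n - 1)] + [(i, n + i) for i in range(n)]


def S_haircomb(n):
    return count_independent_sets(2 * n, haircomb_edges(n))


def S_spoke(n):
    """Z_n: the haircomb H_n with its spine closed into a cycle (n >= 3)."""
    return count_independent_sets(2 * n, haircomb_edges(n) + [(n - 1, 0)])


def closed_form(n):
    """Closed-form solution of S(H_n) = 2 S(H_{n-1}) + 2 S(H_{n-2}) (our derivation)."""
    r = math.sqrt(3)
    return round(((3 + 2 * r) * (1 + r) ** n + (3 - 2 * r) * (1 - r) ** n) / 6)


for n in range(3, 8):
    assert S_haircomb(n) == 2 * S_haircomb(n - 1) + 2 * S_haircomb(n - 2)   # Lemma 5
    assert S_haircomb(n) == closed_form(n)
    assert S_spoke(n) == 2 * S_haircomb(n - 1) + 4 * S_haircomb(n - 3)      # Lemma 6
```

For n = 3 the Lemma 6 check relies on the convention S(H_0) = 1 (only the empty set), which the brute-force counter returns for a graph with no vertices.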

Acknowledgements

We wish to express our gratitude to Dr. Teague and Ms. Gann for guiding us throughout our research.


Non-S-Figurate Numbers

Peter Cheng & Vinit Ranjan & Kelly Zhang

ABSTRACT

The triangular numbers include 1, 3, 6, 10, 15, .... These are the figurate numbers corresponding to a triangle. The motivation behind our research was to discover a formula that would output the non-figurate numbers, the natural numbers remaining after removing the figurate numbers corresponding to a certain shape. The non-figurate numbers were found using patterns observed among figurate numbers. We then analyzed and proved several mechanisms by which non-figurate numbers increase in succession. The procedural methods to find the non-figurate numbers were then converted into an explicit formula for the kth non-s-figurate number. s-figurate numbers are currently used in computing probabilities, adding finite sums of objects, and in some iterative computer programs. Non-figurate numbers, on the other hand, do not yet serve a purpose; much like the 'monster' function, they are a mathematical inquiry with no current application.

1. Definitions

s-figurate number: The sequence of numbers corresponding to the number of points in an s-gon series like the ones below:

Figure 1: The 3-figurate numbers above are 1, 3, 6, 10, 15, 21, ....

Figure 2: The 5-figurate numbers above are 1, 5, 12, 22, ....

non-s-figurate number: A sequence of all the positive integers not included in the sequence of s-figurate numbers.

k: Will generally refer to the position of a non-figurate number.

s: Will generally refer to the number of sides of the regular planar polygon on which the figurate numbers are based.

F(s,n): The function that produces the nth s-figurate number.

F(s; k): The function that produces the kth non-s-figurate number. This was the function sought after.

Binary Oscillator: The binary oscillator of some number is the ceiling function of the number minus the floor function of the same number. If the number is an integer, then this will output zero. Otherwise, this will output one.

Gap: The subsets of numbers that are bounded by two consecutive s-figurate numbers.

In this example, integers from 1 through 10 are shown with the triangular numbers crossed out. The result is the first 6 non-triangular numbers. The gaps here are the subsets [2], [4, 5], and [7, 8, 9]. Each subset of integers that is between the crossed-out numbers is a gap.

Endpoint of a Gap: The k value that corresponds to the non-s-figurate number that is exactly one less than an s-figurate number. It could also be described as the k value of the last integer in a gap.

Adding the number of gaps, denoted as n, to k and adding 1 gave correct non-s-figurate values in our observations. A tentative equation we found was k + n + 1. An exception arose when the sum of the gap sizes was exactly k. In an exception like this, we did not have to add 1 to k + n. These observations formed the fundamental basis for the general formula by:

1) Developing an algebraic method to find the number of gaps before k for any s and k.
2) Forming the basis for the requirement of a binary oscillator to correct 'perfect' values of k by 1.

This requires a method to determine n.

2. Formula for General s-Figurate Numbers

Since the formula for the non-s-figurate numbers is based on the formula for the s-figurate numbers, we first need the formula for these numbers. The general formula for the nth s-figurate number is:

$$F(s,n) = \frac{(s-2)n^2 - (s-4)n}{2}$$

2.1 Proof of General s-figurate Number Formula

Figure 3: The increase from the nth step of a triangular number to the (n+1)th is equal to n+1.

From figure 1, it can be seen that the (n+1)th iteration in the triangular numbers adds in n dots plus 1.

Figure 4: The increase from the nth step of a square number to the (n+1)th is equal to 2n + 1.

It can be seen that on each iteration, for a fixed s, the number added for the next number is equal to n(s-2)+1. For this fixed s value, the formula can be proved by the process of mathematical induction. Let $P_j$ be the statement that, for a fixed s,

$$F(s,j) = \frac{(s-2)j^2 - (s-4)j}{2}$$

Basis: F(s,1) = 1 for any s, which is true.

Inductive Step: Assume that the formula is true for some value k, namely that

$$F(s,k) = \frac{(s-2)k^2 - (s-4)k}{2}$$

Now consider $P_{k+1}$. It needs to be shown that

$$F(s,k+1) = \frac{(s-2)(k+1)^2 - (s-4)(k+1)}{2}$$

Before, it was shown that, on each successive iteration, k(s-2)+1 is added. So, if this is added to F(s, k), the following happens:

$$F(s,k+1) = \frac{(s-2)k^2 - (s-4)k}{2} + k(s-2) + 1$$

By manipulating the variables, the following can be written, and then further simplification produces the desired result:

$$F(s,k+1) = \frac{(s-2)(k^2 + 2k + 1) - (s-4)k - (s-4)}{2} = \frac{(s-2)(k+1)^2 - (s-4)(k+1)}{2}$$

This is exactly $P_{k+1}$, which shows that the formula holds, for a fixed s, by the process of mathematical induction.

2.2 Argument for Increase in Gap Size

Figure 5: The recursive nature of the s-figurate numbers. On any iteration $a_{n+1}$, the number of dots is equal to the previous number, $a_n$, plus additional dots.

Consider the figures in terms of the number of dots per side. In these figures, it can be seen that each successive figure builds off of the previous figure's shared vertex. The dots that are added on can be considered as two different types. The circled dots can be thought of as vertices that are added to the original, so the number of additional dots of this type is s - 1. In addition, the remaining number of dots is equal to n - 1 per non-shared side, of which there are s - 2. So, the result is that

$$a_{n+1} = a_n + (n-1)(s-2) + (s-1)$$

In order to find the size of the gap, the difference between $a_{n+1}$ and $a_n$ is needed, minus 1 to discount the endpoints.

$$a_{n+1} - a_n - 1 = a_n + (s-1) + (n-1)(s-2) - a_n - 1$$
$$a_{n+1} - a_n - 1 = s - 1 + (n-1)(s-2) - 1$$
$$a_{n+1} - a_n - 1 = (s-2)n$$

Therefore, the subsets of integers between any two s-figurate numbers increase at a rate dependent on s and n. In order to see this change from our previous formula for the nth s-figurate number, the partial derivative of F(s, n) can be taken:

$$\frac{\partial F}{\partial n} = (s-2)n - \frac{s-4}{2}$$

This shows us that the difference between two consecutive s-figurate integers grows by a multiple of s - 2, which is used in our formula.

3. General Non-s-Figurate Formula

3.1 Endpoints of Gaps

One thing to notice is that the position k of the last non-figurate number of gap x can be written as $k = \sum_{i=1}^{x} (s-2)i = \frac{(s-2)x(x+1)}{2}$. This is due to the fact that the sizes of gaps, that is, their numbers of non-s-figurate numbers, can be summed to find the position of the last non-s-figurate number of any gap. This is simply the sum of the sizes of all complete gaps and the final gap that k is in.
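The gap-size and endpoint claims can be checked directly. The sketch below is illustrative only: it assumes the standard polygonal-number formula F(s, n) = ((s-2)n^2 - (s-4)n)/2 for the s-figurate numbers, and the helper names are ours.

```python
def figurate(s, n):
    """n-th s-figurate number, assuming F(s, n) = ((s-2)n^2 - (s-4)n) / 2."""
    return ((s - 2) * n * n - (s - 4) * n) // 2


def gap_end_position(s, x):
    """Position k of the last non-s-figurate number in gap x: (s-2)x(x+1)/2."""
    return (s - 2) * x * (x + 1) // 2


for s in range(3, 9):
    non_figurate = []                     # non-s-figurate numbers in increasing order
    for x in range(1, 8):
        low, high = figurate(s, x), figurate(s, x + 1)
        assert high - low - 1 == (s - 2) * x          # gap size from Section 2.2
        non_figurate.extend(range(low + 1, high))
        # The last member of gap x sits at 1-based position gap_end_position(s, x).
        assert len(non_figurate) == gap_end_position(s, x)
        assert non_figurate[-1] == high - 1
```

For s = 3 and x = 3 this reproduces the example in the text: the sixth non-triangular number is 9.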

For example, look at a small list of non-triangular numbers: 2, 4, 5, 7, 8, 9. The position corresponding to value 9 in the list is 6. This is also equal to the sum of the complete gaps and the final gap: 1 + 2 + 3 = 6.

3.2 Constructing the General Non-s-Figurate Formula

To arrive at the number of complete gaps, n, before k, we solve the equation that sums the gaps for any s:

$$k = \frac{(s-2)n(n+1)}{2} \quad\Longrightarrow\quad n = \frac{-1 \pm \sqrt{1 + \frac{8k}{s-2}}}{2}$$

We chose the positive root when using the quadratic formula because we are interested in counting the maximum number of complete gaps, n, which should be positive. In order to find the number of complete gaps, the floor function can be used on the RHS, guaranteeing that n is maximized. The result is

$$n = \left\lfloor \frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2} \right\rfloor$$

However, when k is the last point of a gap, it will account for itself when calculating the floor of the function. This results in a need to subtract a final 1 from the end of the function. To account for this, we created a binary oscillator that gives 0’s when k is perfectly summed by n gaps, and gives 1’s when k is not perfectly summed. This binary oscillator replaces the 1 in the original k + n + 1 statement, as it is possible for the 1 to come from the floor function itself.

$$\left\lceil \frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2} \right\rceil - \left\lfloor \frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2} \right\rfloor$$

This is the binary oscillator that satisfies our conditions. We know that k must be in the form $\frac{(s-2)(x^2+x)}{2}$, which is an endpoint of a gap, to make $\frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2}$ an integer, because plugging k into it results in:

$$\frac{-1 + \sqrt{4x^2 + 4x + 1}}{2} = \frac{-1 + (2x+1)}{2} = x$$

Our final answer for the k-th non-s-figurate number is the original position k plus the number of complete gaps preceding k plus the binary oscillator:

$$F(s;k) = k + \left\lfloor \frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2} \right\rfloor + \left( \left\lceil \frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2} \right\rceil - \left\lfloor \frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2} \right\rfloor \right) \tag{1}$$

4. The Mechanisms of the General Non-s-Figurate Formula

We used three proofs to show the three mechanisms of (1) by which the non-s-figurate numbers increase in the non-s-figurate number list. The first proof shows that within gaps, the incremental increase from one non-s-figurate number to another is 1 (we ignore the last non-s-figurate numbers of gaps in this proof because they increase by a separate mechanism). The second proof deals with the mechanism by which the second-to-last non-s-figurate number increases to the last non-s-figurate number in a gap. The third and final proof shows the jump from the last non-s-figurate number in a gap to the first non-s-figurate number in the next gap.

4.1 Within the Gap

We can show the mechanism for increasing between non-s-figurate numbers whose positions k and k + 1 correspond to non-integer values of $\frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2}$. This progression is the range of non-s-figurate numbers that occur from the first non-s-figurate number in a gap to the second-to-last non-s-figurate number in the gap. We proceed by proof by induction.

Basis: The k values of 4 and 5 both produce non-integer values of $\frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2}$.

Let $P_j$ be the statement that the value of a non-s-figurate number defined in our range is given by (1) with k = j.

Inductive Step: Assume $P_k$ is true for some k within our specified range. Within our specified range, the increase from one non-s-figurate number to another is 1, which is a trivial observation. Therefore we expect the non-s-figurate number at position k + 1 to be equal to $F(s;k) + 1$.

Let us analyze k + 1 in statement $P_j$ by plugging k + 1 into (1).

The binary oscillators for both k and k + 1 give results of 1, because our range of k’s gives non-integer values of $\frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2}$. We see an increase of 1 in the value of (1) from k to k + 1. This difference results from the change of the j term from k to k + 1. This confirms the difference of 1 between all consecutive non-s-figurate numbers defined in our range.

However, we must still show that the floor terms for k and k + 1 result in the same value. We know that k and k + 1 lie between two endpoints of gaps, so, for some integer x:

$$\frac{(s-2)(x^2+x)}{2} < k < k + 1 < \frac{(s-2)(x+1)(x+2)}{2}$$

From this inequality, we can conclude that

$$\left\lfloor \frac{-1 + \sqrt{1 + \frac{8k}{s-2}}}{2} \right\rfloor = \left\lfloor \frac{-1 + \sqrt{1 + \frac{8(k+1)}{s-2}}}{2} \right\rfloor = x$$

Therefore the floor term experiences no change in the middle of a gap.

4.2 Second-Last Non-s-figurate Number to End of Gap

We proceed based on our argument that the non-s-figurate number at the end of a gap is at position $k = \frac{(s-2)(x^2+x)}{2}$, for all positive integers x. Plugging this into (1) results in:

$$F(s;k) = \frac{(s-2)(x^2+x)}{2} + \left\lfloor \frac{-1 + \sqrt{4x^2+4x+1}}{2} \right\rfloor + \left( \left\lceil \frac{-1 + \sqrt{4x^2+4x+1}}{2} \right\rceil - \left\lfloor \frac{-1 + \sqrt{4x^2+4x+1}}{2} \right\rfloor \right)$$

Let us observe k - 1, the position corresponding to the second-to-last non-s-figurate number in a gap, which is equal to $\frac{(s-2)(x^2+x)}{2} - 1$. Plugging this into (1) results in the same expression with first term $\frac{(s-2)(x^2+x)}{2} - 1$ and with $4x^2 + 4x + 1 - \frac{8}{s-2}$ under each radical.

Let us observe the values of these two non-s-figurate numbers comparatively by parts. The binary oscillator for k results in 0 because $4x^2 + 4x + 1$ is a perfect square. The binary oscillator for k - 1 results in 1 because $4x^2 + 4x + 1 - \frac{8}{s-2}$ is not a perfect square with our s values of interest. There is a decrease of 1 in the binary oscillator from k - 1 to k.

The floor term for k must be 1 greater than the floor term for k - 1, because $4x^2 + 4x + 1$ is a perfect square whereas $4x^2 + 4x + 1 - \frac{8}{s-2}$ will never equal a perfect square, because $\frac{8}{s-2}$ is never sufficiently large to reach $4x^2$. So $\sqrt{4x^2 + 4x + 1 - \frac{8}{s-2}}$ is some non-integer strictly greater than 2x and strictly less than 2x + 1. Therefore the floor term for k is exactly x, and the floor term for k - 1 is exactly x - 1. We see an increase of 1 in the floor term from k - 1 to k.

The first terms of the two non-s-figurate representations are $\frac{(s-2)(x^2+x)}{2}$ for k and $\frac{(s-2)(x^2+x)}{2} - 1$ for k - 1. Therefore we see an increase of 1 in the first term from k - 1 to k.

Combining the results from above, the mechanism occurs by an increase of 1 just by increasing k - 1 by 1, an increase of 1 in the floor term because k results in an integer value, and a decrease of 1 in the binary oscillator because $\frac{-1 + \sqrt{4x^2+4x+1}}{2}$ is an integer for k. Overall, the increase from the second-last non-s-figurate number in a gap to the last non-s-figurate number in the same gap is 1.

4.3 End of Gap to Start of Gap

Now we consider the jump from a non-s-figurate number at the end of a gap to the non-s-figurate number at the start of the next gap. In theory, the difference should be 2 because the s-figurate number is skipped over.

We use that the position of the last non-s-figurate number in a gap is $k = \frac{(s-2)(x^2+x)}{2}$, where x is any positive integer. Therefore the k + 1 position corresponds to the first non-s-figurate number in the next gap, and k + 1 is equivalent to $\frac{(s-2)(x^2+x)}{2} + 1$. Plugging $\frac{(s-2)(x^2+x)}{2} + 1$ into (1), the non-s-figurate number corresponding to position k + 1, the result is:

$$F(s;k+1) = \frac{(s-2)(x^2+x)}{2} + 1 + \left\lfloor \frac{-1 + \sqrt{4x^2+4x+1+\frac{8}{s-2}}}{2} \right\rfloor + \left( \left\lceil \frac{-1 + \sqrt{4x^2+4x+1+\frac{8}{s-2}}}{2} \right\rceil - \left\lfloor \frac{-1 + \sqrt{4x^2+4x+1+\frac{8}{s-2}}}{2} \right\rfloor \right)$$

Plugging $\frac{(s-2)(x^2+x)}{2}$ into (1), the non-s-figurate number corresponding to position k, the result is:

$$F(s;k) = \frac{(s-2)(x^2+x)}{2} + \left\lfloor \frac{-1 + \sqrt{4x^2+4x+1}}{2} \right\rfloor + \left( \left\lceil \frac{-1 + \sqrt{4x^2+4x+1}}{2} \right\rceil - \left\lfloor \frac{-1 + \sqrt{4x^2+4x+1}}{2} \right\rfloor \right)$$

We now analyze the difference between the non-s-figurate representations of k and k + 1. Comparing the first term of both non-s-figurate representations, it is clear that $\frac{(s-2)(x^2+x)}{2} + 1$ is 1 greater than $\frac{(s-2)(x^2+x)}{2}$.

The floor term in the representation for k + 1 will always be equal to the floor term in the representation for k. This is due to the fact that the additional $\frac{8}{s-2}$ in the radical, observed trivially with our s values of interest, will never be large enough to increase $4x^2 + 4x + 1 + \frac{8}{s-2}$ to the next endpoint, which would have a value under the radical of $4x^2 + 12x + 9$.

Comparing the binary oscillator of both non-s-figurate representations shows an increase of 1 from k to k + 1. This is due to the fact that the binary oscillator produces 1 for k + 1 because $\frac{-1 + \sqrt{4x^2+4x+1+\frac{8}{s-2}}}{2}$ is a non-integer, as shown before. Meanwhile, the binary oscillator for k produces a 0 because $\frac{-1 + \sqrt{4x^2+4x+1}}{2}$ is an integer. As a result, the change in the binary oscillator from k to k + 1 is 1.

Summing the total changes in the value of the non-s-figurate number from position k, the end of a gap, to position k + 1, the beginning of the next gap, the result is 2 as expected.
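The three mechanisms can be checked against a brute-force list. The sketch below is our own illustration, not the authors' code. It assumes that formula (1) reads k plus the floor of (-1 + sqrt(1 + 8k/(s-2)))/2 plus the binary oscillator of the same quantity, and it evaluates that expression in exact integer arithmetic to avoid floating-point error near perfect squares. The function names are hypothetical.

```python
from math import isqrt


def figurate(s, n):
    """n-th s-figurate number, assuming F(s, n) = ((s-2)n^2 - (s-4)n) / 2."""
    return ((s - 2) * n * n - (s - 4) * n) // 2


def non_figurate(s, k):
    """k-th non-s-figurate number via formula (1), using integers only."""
    # Complete gaps: largest n with (s-2)n(n+1)/2 <= k, i.e. n(n+1) <= floor(2k/(s-2)).
    q = 2 * k // (s - 2)
    n = (isqrt(4 * q + 1) - 1) // 2
    # Binary oscillator: 0 exactly when k is the endpoint of gap n, otherwise 1.
    oscillator = 0 if (s - 2) * n * (n + 1) == 2 * k else 1
    return k + n + oscillator


def non_figurate_brute(s, count):
    """First `count` non-s-figurate numbers by direct filtering."""
    figurates, n, found, value = set(), 1, [], 1
    while len(found) < count:
        while figurate(s, n) <= value:
            figurates.add(figurate(s, n))
            n += 1
        if value not in figurates:
            found.append(value)
        value += 1
    return found


for s in range(3, 10):
    assert [non_figurate(s, k) for k in range(1, 201)] == non_figurate_brute(s, 200)
```

Since the oscillator is the ceiling minus the floor of the same quantity, the three terms of (1) collapse to k plus the ceiling of that quantity; the code keeps the terms separate to mirror the discussion in Section 4.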

5. Alternate Formulas

From the base k + n + 1 formula, an alternate method to determine n can be used. This involves looking at the number of s-figurate numbers preceding k as opposed to the number of complete gaps. This can be done using the formula found for the general s-figurate numbers. However, this method is not always true, as accounting for the number of s-figurate numbers before k plus one accounts for the last s-figurate number that exists in the list. This cannot simply be fixed by a binary oscillator because other values of the function will be shifted. This results from the issue that multiple values of k within a gap are shifted. However, this method still has some merit to it, as the final formula can be changed to account for the shift in output values. So, let us proceed by this method. Similarly to before, we need to find the maximum value of n that is less than k, using the formula for the general s-figurate number.

[equation not recoverable from the source]

Using a similar manipulation of the floor function, we arrive at

[equation not recoverable from the source]

The issue with this formula is the shift of k at different values. So, the formula in this exact form does not actually work. However, this formula is useful, as we used it in other experimentation and manipulation, and we arrived at alternate formulas for non-triangular and non-square numbers. The alternate formula for the non-triangular numbers is

[equation not recoverable from the source]

We could not strictly prove this formula, but it comes from shifting the floor function plus one to a ceiling function. The difference in the formula is a result of the difference in endpoint values of the ceiling and floor functions. In addition, we found the alternate formula for non-square numbers through similar means, but with a key difference: we were also able to prove this formula.

5.1 Proof of Alternate Formula for Non-Square Numbers

We will proceed by the process of mathematical induction.

Basis: The first non-square number is 2, and for p = 1 the formula gives

$$1 + \left\lfloor \sqrt{1} + \tfrac{1}{2} \right\rfloor = 2$$

Inductive Step: Assume the above formula is true for some p - 1. Now let us consider the formula for the number p. Consider the set of integers where p lies, which is bounded by two perfect squares that can be expressed in terms of some integer g as $g^2$ and $(g+1)^2$. Any integer in this subset can be written as:

$$g^2 + r, \quad 0 \le r < 2g + 1$$

where r is also some integer. Now let us write $\sqrt{g^2 + r}$ as a sum g + t. This means 0 ≤ t < 1, and

$$(g + t)^2 = g^2 + r \quad\Longrightarrow\quad t = \frac{r}{2g + t}$$

We know that since r is strictly less than 2g + 1, the maximum value of r is 2g. If we substitute this into our final simplification of t, we see

$$t \le \frac{2g}{2g + t}$$

We know that t is strictly less than 1 but could be exactly equal to 0. This would make the fraction above 1. But t cannot be 1. t must be less than 1, so if we remove t from the denominator, the fraction will be guaranteed to be at least as large as t. From this deduction, we find that

$$t \le \frac{r}{2g}$$

Consider 0 ≤ r < 2g + 1 as two subintervals:

$$0 \le r \le g$$
$$g < r < 2g + 1$$

On the subinterval where 0 ≤ r ≤ g, $t \le \tfrac{1}{2}$ from the inequality $t \le \frac{r}{2g}$. On the subinterval where g < r < 2g + 1, it can be seen from the same inequality that $t > \tfrac{1}{2}$. A side note to consider here is that the case where $t = \tfrac{1}{2}$ is impossible, because that implies that $(g + \tfrac{1}{2})^2 = g^2 + r$. We know that g and r are both integers, and any number with a decimal part of .5 cannot square to an integer. So the case $t = \tfrac{1}{2}$ can be discarded. Now consider

$$p + \left\lfloor \sqrt{p} + \tfrac{1}{2} \right\rfloor$$

Recall that r can still be 0, so this allows us to input all integers.

On the interval 0 ≤ r ≤ g, $\left\lfloor \sqrt{p} + \tfrac{1}{2} \right\rfloor = g$ because $\sqrt{p} = g + t$, with $t < \tfrac{1}{2}$ as seen before. So, $p + \left\lfloor \sqrt{p} + \tfrac{1}{2} \right\rfloor = p + g$. On the interval g < r < 2g + 1, $\left\lfloor \sqrt{p} + \tfrac{1}{2} \right\rfloor = g + 1$ through similar reasoning, as $t > \tfrac{1}{2}$, causing there to be an extra 1 in the function’s output.

From the equation on the interval 0 ≤ r ≤ g, we see that the output ranges from $g^2 + g$ to $g^2 + 2g$, by plugging in the endpoints of the interval. For the second interval, the values that are used as input must be g + 1 and 2g, as the boundaries on the interval are strict. This means that the output ranges from $g^2 + 2g + 2$ to $g^2 + 3g + 1$. On this range of possible output, from $g^2 + g$ to $g^2 + 3g + 1$, the only perfect square is $g^2 + 2g + 1 = (g + 1)^2$, which is successfully ignored in the function’s output.

Consider the next subset of values from $(g + 1)^2$ to $(g + 2)^2$. From the first equation found for the range of values on the inner interval 0 ≤ r ≤ g, the next value of the function is equal to $(g + 1)^2 + g + 1$, which is equal to $g^2 + 3g + 2$. This continues outputting integers where the last subset stopped. This chain continues for all p and only outputs non-square integers. Therefore the formula holds by the process of mathematical induction.
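The output ranges derived in the proof can be checked numerically. The following sketch assumes that the alternate formula for the p-th non-square number is p + floor(sqrt(p) + 1/2), which is the form consistent with the ranges g^2 + g to g^2 + 2g and g^2 + 2g + 2 to g^2 + 3g + 1 above; the function name is ours. It uses the integer identity floor(sqrt(p) + 1/2) = (isqrt(4p) + 1) // 2 so that no floating-point square roots are involved.

```python
from math import isqrt


def non_square(p):
    """p-th non-square number, assuming the formula p + floor(sqrt(p) + 1/2)."""
    # floor(sqrt(p) + 1/2) == (floor(2 * sqrt(p)) + 1) // 2 == (isqrt(4p) + 1) // 2
    return p + (isqrt(4 * p) + 1) // 2


# Compare with the non-squares below 2000 found by direct filtering.
brute = [m for m in range(1, 2000) if isqrt(m) ** 2 != m]
assert [non_square(p) for p in range(1, len(brute) + 1)] == brute

# The two subintervals of the proof, for p = g^2 + r:
for g in range(1, 30):
    for r in range(0, g + 1):
        assert non_square(g * g + r) == g * g + r + g          # 0 <= r <= g
    for r in range(g + 1, 2 * g + 1):
        assert non_square(g * g + r) == g * g + r + g + 1      # g < r < 2g + 1
```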

6. Conclusion and Future Work

The reason for our research was to determine an explicit formula for non-s-figurate numbers. Through an analysis of the s-figurate numbers, we were able to construct the general formula of the kth non-s-figurate number. This construction was made based on gaps and the discovery of the binary oscillator for positive integers. An alternative method, although shown unsuccessful for general cases, was also used to find shorter formulas specifically for the non-triangular and non-square numbers.

For future work, we would like to continue using our alternate formula to see if we can find some constants for each value of s, similar to the 1/2 for the case s = 4. We would also like to expand on the notion of non-figurate numbers to include 3-dimensional figures with regular polyhedra. This could possibly also be expanded to higher dimensions. In addition, we would like to be able to find a practical application for the non-figurate numbers in general.

7. Acknowledgements

First of all, we are grateful to our instructor Dr. Daniel Teague for assigning the problem. In addition, we would like to thank Ms. Julie Graves for serving as our adviser on the problem.



Feature Article: An Interview with Maya Ajmera

Left: Ms. Maya Ajmera, President and CEO of Society for Science and the Public, NCSSM Class of 1985
Right: Sicheng Zeng, BSS Chief Publication Editor; Ahmad Askar, BSS Essay Contest Winner; Rishi Sundaresan and Nimit Desai, BSS Chief Editors; Jonathan Bennett, BSS Faculty Advisor

What do you think is the biggest misconception people have about science today?

One of the biggest problems that affect a lot of kids in this country is the perception that science is hard and it isn’t fun. It is easier if I just don’t participate. But, I think everything we do and touch is all related to science. I think we need to do a better job of promoting STEM at earlier ages: two and three year olds! On a national scale, science has become political. Science is used as a political tool to sway people from making the best choices about their livelihoods, as witnessed with the measles vaccine; the outbreak at Disneyland was scary. A lot of misinformation was promoted, especially online. The web has really given us a lot of knowledge, but at the same time there is inaccurate information found online. Hence, one needs to be very careful about what they read.

Do you think there is enough of a push given to high school students to pursue research in STEM?

No, there is not enough emphasis nationally to encourage high school students to pursue STEM research. I was very lucky. I had the opportunity to do bench research in a botany lab when I was 13 years old. I was fortunate to grow up in Greenville, North Carolina, which is home to East Carolina University. I had the opportunity to work with a great scientist and conduct basic research. I started competing in science competitions as well. I think NCSSM is a rare gem with the mentorship program. I am starting to see many magnet schools and academies build mentorship programs. NCSSM was out in front of this movement when the school was founded. When I was a student in the mid 80s, the mentorship program at NCSSM was an important part of my life.
We must do more to engage our alumni to become STEM mentors for the next generation. I think the mentorship network is not as strong as it can be. We need to do a better job of encouraging our alumni at NCSSM and other institutions to pay it forward and become STEM mentors for the next generation.

With competitions such as the Science Talent Search and the International Science & Engineering Fair, what are you and the Society for Science & the Public trying to accomplish (other than spreading STEM of course)?

The Science Talent Search (STS) was founded in 1942. STS is celebrating its 75th anniversary this year. We have had only two sponsors in our 75 year history: Westinghouse and Intel. The Society for Science & the Public (Society) is the best in the world in finding the extraordinary talent who will be our future STEM leaders. These high school seniors are not only conducting groundbreaking research but they are incredibly well-rounded. Each year we honor three hundred of the best and brightest young scientists in this country and from that we select 40 finalists. NCSSM boasts hundreds of these students. In fact, I was actually the first young woman to be honored in the Westinghouse Science Talent Search in North Carolina.

With the Intel International Science & Engineering Fair (ISEF), we bring nearly 2,000 students from 80 countries to showcase their science projects in over 25 categories to compete for $4 million in awards. It is the most powerful talent pipeline in science and engineering in the world. We really look at ISEF as the place for invention and innovation, while STS rewards students for groundbreaking basic research.

At the Society, I am also publisher of the award-winning magazine, Science News. The Society’s founder, E.W. Scripps, founded this magazine nearly 100 years ago. Science News is a bi-weekly magazine providing up-to-date scientific and technical developments. It also has a digital magazine that is read by over 12 million people. We have one of the few science newsrooms left in the United States. Our journalists all have advanced scientific degrees in their chosen fields and a journalism degree, too. I believe Science News and other science magazines and news stories are one of the bedrocks of a flourishing democracy.

You mentioned you did a research project in high school. How do you think scientific research in high school has changed since you were a student?

Technology has changed everything. For example, the biological sciences are at a critical inflection point with technology. Bioinformatics is using computers to store and analyze large data sets and make sense of them. This just was not possible when I was in high school. Data was collected manually in a lab book.

How has NCSSM helped your career path and aspirations?
NCSSM gave me the courage to continue a journey of exploration and risk taking. All NCSSM students are risk takers. We are 16 years old, and we are leaving home. We are going to a publicly funded residential high school that is very rigorous academically but also gives students the opportunity to be creative and innovative. The mentorship program at NCSSM was monumental for me. It gave me an experiential learning opportunity at Duke University, a world-class research institution. Regardless of the career path you take, science imparts lasting lessons. There has been an emphasis in our society that it is not OK to make mistakes. In science, when an experiment does not work, that is how we get answers. It pushes scientists to think differently about a problem. Secondly, integrity is at the heart of scientific research. This value is so important in everything we do. The third aspect is collaboration. I was very fortunate to see scientific collaborators in the labs I worked in. Today, it is very important for me to solve problems in a collaborative way. With science, regardless of the path you take and the work you do, the core values you learn from being a scientist carry you in any facet of life.
Looking back on your journey, what part of your Science and Math experience would you like to see changed or added?
I would love to see every student be in a research lab. I don’t think that has happened yet at NCSSM, but I think that would be extraordinary. I think it is important for NCSSM to have more research labs, too. The arts play an important role in STEM. It is called STEAM. I think it is vital for all students to have some grounding in the arts. Many creative people in technology and science have had a strong influence from the arts. I don’t know how much you have read about Steve Jobs, but when he dropped out of Reed College, he decided to drop in on a calligraphy class. That class actually influenced him a lot. He created the most beautiful fonts in Apple devices.
As Mr. Jobs has said, you wouldn’t have known that that calligraphy class would have had any influence at all because it is hard to connect the dots looking forward, but easy to connect them looking backwards. You don’t want to be dismissive of any academic discipline, because you never know how it will influence your journey.
More specifically, what activities could you do that would improve your presentation in writing?
Science is complex and it can be hard to explain. We need to have writing classes that focus on writing about science for the public. How does your research make the world a better place? How does your research influence public policy? Good
writing helps to bridge the divide.
What other qualities must a scientist, researcher, or entrepreneur have to be successful?
Every kid at NCSSM is brilliant. Three skills that are very important in our fast-paced, changing world are becoming an eloquent writer and speaker, working well in teams, and listening to and understanding diverse points of view.
Going off your entrepreneurial experience, what inspired you to start The Global Fund for Children?
I majored in biology with a concentration in neuroscience at Bryn Mawr College. I received a Rotary International Graduate Fellowship to travel to South and Southeast Asia. During my travels, I began seeing homegrown innovations taking place in local communities. In India, I got off at a train station in a city called Bhubaneswar where I saw forty or fifty kids sitting in a circle on the platform; they were learning how to read and write with a teacher holding flashcards in the middle of the circle. I learned that the kids actually lived around the train platform. The kids worked, played, slept and begged on the train platform but they did not go to school. A teacher noticed this and decided to bring the school to the children. That was my moment of obligation. I realized I wanted to help groups of kids like these and scale innovative programs for poor children globally. I decided to put off medical school and get my graduate degree in public policy at Duke University. I took courses in economics, international development, and education. When I was 24 years old, I founded The Global Fund for Children with $25,000 in seed capital. Twenty years later, we have invested nearly $40 million in innovative grassroots organizations serving the most vulnerable children and youth in the world, including the train platform schools. I also founded a children’s book publishing imprint with Charlesbridge Publishing.
We have over thirty books in the marketplace that celebrate the similarities that all children share around the world. I recently released my first children’s book in the area of science. It is called Every Breath We Take. It is written with Dominique Browning, co-founder of the Moms Clean Air Force of the Environmental Defense Fund. I am hoping to write more children’s books with a scientific bent.
In starting a company, entrepreneurs will face hardships and obstacles to overcome. At these points, in your opinion, how should they decide whether to call it quits or keep going with the product?
When I founded The Global Fund for Children, there were times when I wanted to quit. However, I was fortunate to have great mentors who encouraged me to keep going. One of the biggest pieces of advice I would give to young entrepreneurs is to really surround themselves with people who believe in them and stay away from people who suck the energy and passion out of them, or who constantly are naysayers. Do not confuse naysayers with helpful mentors, who ask you tough questions to guide your thinking. There is a big difference. In addition, an entrepreneur usually comes to a point in their own journey when they know it’s time to call it quits. Sometimes the reason is simply that the enterprise has run out of money. Even if the enterprise does not succeed it is not a bad thing, because you learn from the experience. Failure can be a great learning opportunity.
Do you have any other advice for young students and entrepreneurs?
All students need to have a global orientation. In fact, I would support a requirement for all NCSSM students to travel abroad. There are many ways you can learn from overseas experiences. You could visit research labs such as the Max Planck Institute in Germany, or you could volunteer in the developing world and teach English to children for the summer.
Experiential learning overseas helps young people become more aware of and open to diverse peoples and cultures.
Finally, there are a lot of “scientifically illiterate” people out there who do not believe or trust the data. How do we teach them the value and importance of science?
It has to start with young people. Then we have to make sure the educators and school systems are reaching out to students and providing accurate scientific information. I believe this is one of the biggest challenges facing our country today. How do we make sure that accurate scientific information gets into the hands of our citizens? At the Society for Science & the Public, we have begun a new program called Science News in High School that gets our magazine free of charge into science departments across the country through a sponsorship model.
I think that if we can get a lot more kids doing research, inventing new technologies and applications, and trying to find solutions to intractable problems, this will be an important step in the right direction.

Questions? Comments? Submissions?

Broad Street Scientific Volume 5  

The Broad Street Scientific publishes research from students at The North Carolina School of Science and Mathematics.
