
Glacial Flooding & Disaster Risk Management Knowledge Exchange and Field Training July 11-24, 2013 in Huaraz, Peru

The Great Himalaya Trail Pilot Study on Plant Species Distribution – A Citizen Science Initiative

Paribesh Pradhan1, Rajan Bajracharya2
1. Annapurna Foundation, 2. International Center for Integrated Mountain Development (ICIMOD)

Abstract:
The Hindu Kush-Himalayan (HKH) region is rich in global biodiversity hotspots, eco-regions, bird areas, plant areas, and Ramsar Sites. The conservation of species begins with an understanding of the distribution, abundance, habitat preferences and movements of organisms across wide geographic areas and over long periods of time, as well as their association with the lives of people, their culture and traditions, and the goods and services they provide. We propose a citizen science initiative that combines human observations along the trail with geospatial informatics to inform the understanding of environmental change in the HKH region. This paper explores different aspects of developing a crowd-sourcing application for environmental monitoring of the HKH region. The initiative is intended to address not only biodiversity but, in future, other thematic issues such as hazards, vulnerability, and adaptation and coping case studies.

Keywords:
Hindu Kush Himalaya, the Great Himalaya Trail, Biodiversity, Crowd Sourcing, Citizen Science

Introduction: The Great Himalaya Trail
"The Great Himalaya Trail (GHT) – My Climate Initiative" was initiated in 2012 by Paribesh Pradhan with financial support from the Global Programme Climate Change (GPCC) of the Swiss Agency for Development and Cooperation (SDC). The project entailed walking from the east to the west of Nepal, a distance of 1,555 km, in 98 days along the Great Himalaya Trail (GHT) to document communities' perceptions of change and stories of sustainable adaptation practices, vulnerabilities and impacts of climate change. The journey also offered the opportunity to photograph a large number of species and habitats. Over 500 unique, high-resolution, geo-tagged photographs of plant species were taken as a voluntary initiative along the length of the GHT. These photographs are now being used, together with the International Center for Integrated Mountain Development (ICIMOD), to develop a pilot geospatial database application on species distribution along the GHT. Such an application is of critical importance in the Hindu Kush-Himalayan (HKH) region, where biodiversity has not been fully documented.

Importance of Biodiversity in the Hindu Kush Himalaya
Almost one-third of the HKH region is covered by all or part of 4 global biodiversity hotspots, 6 UNESCO Natural World Heritage Sites, 60 eco-regions, 330 important bird areas, 53 important plant areas for medicinal plants, and 29 Ramsar Sites (ICIMOD, 2009). A wide variety of ecosystems support specialized biodiversity with many globally threatened, endemic, and migratory species. The conservation of species begins with an understanding of the distribution, abundance, habitat preferences and movements of organisms across wide geographic areas and over long periods of time, as well as their association with the lives of people, their culture and traditions, and the goods and services they provide. This pilot citizen science initiative will help in understanding the different aspects involved in developing a crowd-sourcing application for environmental monitoring of the HKH region, one that can later address other thematic issues such as hazards, vulnerability, and adaptation and coping case studies related to climate change.

Approach: A Citizen Science Initiative using Crowd Sourcing Tools
Engaging communities and citizens to take photographs on a mass scale for the purpose of understanding science and nature is still a relatively new and evolving approach. However, many such initiatives are taking place all over the world. Galaxy Zoo has involved more than 200,000 participants in classifying more than 100 million galaxies through a web-enabled interface (Wood et al., 2011).
The game-based project FoldIt attempts to predict protein structure by utilizing humans' puzzle-solving abilities (ibid). Wikipedia, which allows users to add or edit any article, is an example of a successful model on a large scale. eBird collects about 5,000 checklists and 75,000 observations each day that all go into a single standard database; in 2011, eBird contributors volunteered more than 1.3 million hours collecting bird observations (Hardin, 2012).

Similarly, the National Biodiversity Network in the UK has over 31 million records of plant and animal species, largely submitted by amateur naturalists (Stafford et al., 2010). In Australia there are large-scale citizen science projects mapping distributions of species as diverse as possums, whale sharks and frogs (ibid). The National Science Foundation's DataONE, the EURING bird ringing and recovery scheme, the India Biodiversity Portal, Plantwise and geowiki are a few more examples. Popular photo collection websites such as Flickr and Pinterest, along with generic ones, also provide categories such as science and nature.

Citizen science projects stem from pervasive access to geospatial informatics, comprising remote sensing and Geographical Information Systems (GIS), and to information technology, consisting of internet and mobile systems. They leverage the potential of these technologies for data collection, data management, quality control, data processing, analysis, and the serving of information and applications, and for developing Human/Computer Learning Networks (HCLN). These networks can recruit human observers broadly and process their contributed data with Artificial Intelligence (AI) algorithms, for a resulting total computational power far exceeding the sum of the individual parts.

A wide variety of ecosystems support specialized biodiversity with many globally threatened, endemic, and migratory species, but biodiversity has not been fully documented in the HKH region. Data availability is limited, and citizen science initiatives using crowd-sourcing techniques could be a cost-effective and efficient way to fill this gap. The term crowd-sourcing was coined by Jeff Howe in an article in Wired magazine (Howe, 2006).

However, crowd-sourced data are difficult to structure: disorganized crowd-based content such as text, images and video hinders the management of an ecological information system. In addition, there are few well-established repositories or standard protocols for the archiving and retrieval of ecological observation data (Madin et al., 2007). A researcher investigating a particular case therefore has to struggle with retrieval from disorganized crowd-based content and also has to search heterogeneous repositories to obtain the data needed. Knowledge extraction (data discovery), consolidation of unstructured data, and integration with other data repositories thus demand great administrative effort, which hinders research activities.

An ontology provides a convenient basis for adding detailed semantic annotations to scientific data and can be extended with specialized domain vocabularies, making it both broadly applicable and highly customizable (Madin et al., 2007).
The development of a semantic platform can address these problems by creating smart content using a semantic (ontology-based) knowledge management and retrieval system. To develop a semantic platform that captures semantically rich crowd-sourced ecological datasets (text, image, and video), the following steps have to be considered:

• A mechanism or engine to identify the concepts or 'domain patterns' (e.g. topology, dryness, landcover) in the crowd-sourced data and map semantics to those concepts.

• Adding value to existing foundational ontologies (domain/ecological patterns) and enriching the semantics in the knowledge schema.

• Development of wrappers that lift data from the original sources to a meaningful, machine-readable level. An example is the Google Art wrapper (Guéret, 2011).

• Smart content and an advanced user interface for presenting the ecological knowledge base. This includes merging several ontologies to provide one semantically rich access point for all crowd-sourced data in the domain and for relevant data repositories, in order to achieve an integration of the resources.
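As an illustration of the first step above, the following is a minimal Python sketch of how free-text crowd contributions might be mapped to domain patterns. It is a sketch under stated assumptions, not the engine proposed in this paper: the concept names are taken from the examples in the text, while the trigger terms, the `Annotation` class and the `annotate` function are hypothetical, and simple keyword matching stands in for whatever pattern recognition a real system would use.

```python
import re
from dataclasses import dataclass, field

# Hypothetical domain vocabulary: concept name -> trigger terms.
# The concept names follow the examples in the text; the terms are illustrative only.
DOMAIN_PATTERNS = {
    "landcover": {"forest", "scrub", "grassland", "meadow"},
    "dryness": {"dry", "arid", "moist", "wet"},
    "topology": {"ridge", "slope", "valley", "pass"},
}


@dataclass
class Annotation:
    """A piece of crowd-sourced text together with the concepts found in it."""
    text: str
    concepts: dict = field(default_factory=dict)  # concept -> sorted matched terms


def annotate(text, patterns=DOMAIN_PATTERNS):
    """Tag free text with every domain concept whose trigger terms occur in it."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    concepts = {}
    for concept, terms in patterns.items():
        hits = sorted(tokens & terms)
        if hits:
            concepts[concept] = hits
    return Annotation(text=text, concepts=concepts)
```

Calling `annotate("Open woods, forest margins, scrub, dry sunny places")` would tag the text with the `landcover` and `dryness` concepts. Keyword matching keeps the sketch short; it would miss synonyms, inflected forms and local names, which is where an ontology with domain vocabularies becomes useful.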

The following conceptual pillars are necessary to implement this citizen science initiative.

Semantic Platform:
The semantic web will provide a way to package data with its meaning as smart content using the Topic Maps technology. Figure 1 shows the conceptual framework of the semantic web platform. Topic Maps is an international industry standard (ISO/IEC 13250, 2003) for knowledge representation and information integration. It provides the ability to store, together with the data, complex metadata that represents the semantics, i.e. that records the meaning of the data stored. Unlike other technologies (e.g. RDF/OWL), Topic Maps provides the ability to represent knowledge in a natural way, the way humans grasp knowledge. This approach can be extremely powerful, especially when humans must interact with information systems.

All the types in a topic map (the topic types, the occurrence types, the association types and the role types) are defined as topics. These topics provide the conceptual skeleton of the topic map. Together with the scoping topics, they are referred to in the Topic Maps community as the Topic Maps Ontology. Ontologies are very useful when authoring topic maps as they help to identify the borders of the domain of knowledge that the topic map represents. The Topic Maps standard provides the ability to merge topic maps in order to achieve an integration of the resources (Bleier et al., 2010).

The proposed platform integrates (federates) crowd-sourced data so that it self-organizes, while a wrapper application wraps other heterogeneous data to extend ontologies and content from different distributed sources, providing one access point using domain vocabularies. This will turn the data into Smart Content.
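The merging of topic maps described in this section can be illustrated with a small Python sketch. It is a deliberately simplified model and an assumption of this example, not the ISO/IEC 13250 merging rules and not the TMAPI interface: the classes `Topic` and `TopicMap` are hypothetical, types are plain strings rather than topics, and two topics are treated as the same subject only when their identifiers are equal.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    identifier: str   # subject identifier; equal identifiers mean the same subject
    topic_type: str   # simplified: a real topic map would model the type as a topic
    names: set = field(default_factory=set)
    occurrences: dict = field(default_factory=dict)  # occurrence type -> set of values


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)      # identifier -> Topic
    associations: set = field(default_factory=set)  # (type, ((role, topic id), ...))

    def add_topic(self, topic):
        """Add a copy of the topic, or fold it into an existing topic with the same identifier."""
        existing = self.topics.get(topic.identifier)
        if existing is None:
            self.topics[topic.identifier] = Topic(
                topic.identifier,
                topic.topic_type,
                set(topic.names),
                {k: set(v) for k, v in topic.occurrences.items()},
            )
            return
        existing.names |= topic.names
        for occ_type, values in topic.occurrences.items():
            existing.occurrences.setdefault(occ_type, set()).update(values)

    def associate(self, assoc_type, **roles):
        """Record an association; roles are given as role=topic_identifier."""
        self.associations.add((assoc_type, tuple(sorted(roles.items()))))

    def merge(self, other):
        """Return a new map holding the union of both maps; the inputs are left unchanged."""
        merged = TopicMap()
        for topic_map in (self, other):
            for topic in topic_map.topics.values():
                merged.add_topic(topic)
            merged.associations |= topic_map.associations
        return merged
```

For example, a map built from trail photographs and a map built from a flora reference could both contain a topic with the identifier `rosa-sericea`; after `merge`, that single topic would carry the names and occurrences from both sources, which is the kind of single access point the platform aims for.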
Smart Content Layer:
In order to allow the data to present itself according to its meaning and within context, the layer should be able to provide a Topic Map Application Programming Interface (TMAPI, 2010) for creating and using Semantically Active Components (SACs). A SAC is a component that enables the presentation or the processing of data by the platform. Examples of SACs are a component that presents data in a table, a component that presents a graph, or a component that sends an email when a certain condition related to the data is met. By implementing the API that the layer defines, any SAC can be configured by the semantics of the data and/or the context in which the data is being accessed (the role of the user, their activity, or their objectives). Moreover, it will be possible to nest SACs within other SACs. This allows, for example, creating a web presentation for all the topics of a certain type accessed by a certain user in a certain situation. The fact that the data is self-explanatory, and the effect that the data semantics and the context of access have on its presentation or processing, make it Smart Content.

Advanced User Interfaces Layer:
The semantic platform will include a natural language user interface and will provide a way to access the data by asking questions and conducting dialogs with the system. This enables users to easily perform semantically rich queries. While the natural language user interface lets the user pose queries, a graphical user interface will allow the user to browse through the available knowledge and data. This graphical user interface will integrate the Semantically Active Components in order to visualize different types of data in different ways. In other words, it synergizes existing informatics resources using more user-friendly integrative UIs.

Pilot Study from the Great Himalaya Trail
All the photographs from the GHT were taken in RAW format using a Canon 7D camera. The photographs had to be preprocessed and converted into low-resolution JPEG format suitable for web publishing. The preprocessing also included creating a database and an analysis to associate the photos with the data received from the GPS device.
A set of attributes was also identified while preparing this database. The second step will involve identification of all plant species in the photos by a taxonomist, thus adding value to the database. As an example, the following is the record for the Darimpate plant, known scientifically as Rosa sericea.

No.  Attribute              Value
1    Photo ID               IMG_4294
2    Scientific Name        Rosa sericea
3    Local Name/s           Darimpate
4    Family Name
5    Photographed Date      7 May 2012
6    Time                   1:12:08 PM
7    Altitude               2825 m
8    Altitudinal Range      1820 - 4850 m
9    Longitude              87.9058861
10   Latitude               27.4832412607
11   Photograph By          Paribesh Pradhan
12   District               Taplejung
13   Plant Features         Deciduous shrub, 1-2 m tall; stems smooth or bristly or
                            with robust red thorns, sometimes wing-like, paired below
                            leaves or scattered along branches; leaves pinnate, leaflets
                            ovate-obovate; margin entire at base, serrate towards
                            apex; flowers solitary on short side shoots, white to
                            creamy-yellow, 4 petaled; fruit a hip, red, obovoid to
                            globose
14   Common Habitat         Open woods, forest margins, scrub, dry sunny places
15   Regional Distribution  India (Sikkim, Assam), Nepal, Bhutan, Myanmar,
                            Tibet/China
16   Remarks                -
17   Economic use           Generis/medicinal/…
18   Endemic value          -
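The association of photographs with GPS data during preprocessing can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the pilot: it assumes the GPS device produced a time-ordered track log, that the camera and GPS clocks share a time zone, and that a photo is matched to the track point nearest in time. The names `TrackPoint` and `nearest_point` and the five-minute tolerance are hypothetical.

```python
from bisect import bisect_left
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TrackPoint:
    """One fix from the GPS track log."""
    time: datetime
    lat: float
    lon: float
    alt_m: float


def nearest_point(track, when, max_gap=timedelta(minutes=5)):
    """Return the track point closest in time to `when`.

    `track` must be sorted by time. Returns None when the track is empty or
    the closest fix is further away than `max_gap`.
    """
    if not track:
        return None
    times = [point.time for point in track]
    i = bisect_left(times, when)
    # Only the fixes immediately before and after `when` can be the closest.
    candidates = track[max(i - 1, 0): i + 1]
    best = min(candidates, key=lambda point: abs(point.time - when))
    return best if abs(best.time - when) <= max_gap else None
```

A photo's capture time, read from its metadata, would be passed as `when`; the returned point supplies the latitude, longitude and altitude fields of a record like the one above. Rejecting matches beyond `max_gap` avoids assigning a position to photos taken while the GPS device was switched off.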

The third and most crucial step will be the development of the semantic web platform and the integration of this database into it. The system will be developed as described in the conceptual framework of the semantic web platform proposed above.

Figure 2. Geotagged photographs from the Great Himalaya Trail. Map Courtesy: ICIMOD

Conclusion
An application based on the semantic web platform, for GPS-enabled smart phones and similar devices, is under development as part of the pilot project to collect photo data of different plant species. Technical challenges constraining the current development include automatic pre-annotation of the data by the system before it is fed to the central database framework, and interoperability among the systems. It is also a technical challenge to incorporate AI that will automatically identify plant species, eliminating the need for a taxonomist to identify and approve every record. Nevertheless, this project has the potential to bridge the data gap on biodiversity in the Himalayas for researchers and scientists in future. It could also be replicated in the other thematic areas mentioned previously, and may be useful as a smart application with which trekkers and hikers can identify plant species in real time while they are in the mountains. However, it will require more research, funding and time to develop this into a fully functional application through which the crowd can both feed in data and get immediate information about the plants they have photographed.

References

Bleier, Arnim, Patrick Jähnichen, Uta Schulze, and Lutz Maicher. 2010. "The Praxis of Social Knowledge Federation", Presentation on Topic Maps services held at the Second International Workshop on Knowledge Federation, in Dubrovnik, Croatia.

Guéret, Christopher. 2011. "GoogleArt – Semantic Data Wrapper (Technical Update)", March 25, 2011. Accessed on 10 June, 2013: -semantic-data-wrapper-technical-update_b18726.

Hardin, Steve. 2012. "How to Identify Ducks in Flight: A Crowdsourcing Approach to Biodiversity Research and Conservation", The Information Association for the Information Age, Bulletin February/March 2012. Accessed on 14 June, 2013: -12/FebMar12_Hardin_Kelling.html

Howe, Jeff. 2006. "The rise of crowd sourcing", Wired, Issue 14.06. Accessed on 14 June, 2013.

ICIMOD. 2009. "Mountain Biodiversity and Climate Change", International Center for Integrated Mountain Development (ICIMOD), Kathmandu, Nepal.

ISO. 2003. "Information Technology Document Description and Processing Languages Topic Maps", International Organization for Standardization (ISO), Geneva, Switzerland. -2nd-ed-v2.pdf

Madin, Joshua, Shawn Bowers, Mark Schildhauer, Sergeui Krivov, Deana Pennington, and Ferdinando Villa. 2007. "An ontology for describing and synthesizing ecological observation data", Ecological Informatics, Volume 2, Issue 3, Pages 279-296, ISSN 1574-9541, Available at:
