Solr

princy | July 6th, 2011 - 14:16:59

About Solr

  • Built on Lucene; wraps Lucene as a standalone search server
  • Handles indexing and query requests over HTTP
  • Has a web-based administrative interface
  • Configuration file and schema file are written in XML
  • Faceting of query results
  • Spell checking
  • MoreLikeThis (finds documents similar to a given one)
  • Distributed search across multiple Solr servers
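
Since indexing and querying go over plain HTTP, a query is just a URL with parameters. The sketch below builds such a URL for the /select handler with faceting turned on. The host, port and the "category" field are assumptions for illustration only; q, rows, wt, facet and facet.field are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Assumed for illustration: a Solr server on localhost:8983.
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(query, facet_field=None, rows=10):
    """Return a URL for Solr's /select query handler."""
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if facet_field is not None:
        # facet=true turns faceting on; facet.field names the field to count.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return SOLR_BASE + "/select?" + urlencode(params)


if __name__ == "__main__":
    # "category" is a hypothetical field name in the schema.
    print(build_select_url("title:hadoop", facet_field="category"))
```

Fetching the resulting URL with any HTTP client should return the matching documents plus per-value counts for the faceted field, provided the schema actually defines that field.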

Home page

  • http://lucene.apache.org/solr/

Solr books

  • 「Apache Solr 入門」 (in Japanese; "Introduction to Apache Solr")
  • Solr 1.4 Enterprise Search Server

To be continued …

NoSQL

princy | July 6th, 2011 - 13:53:04

Looking for everything about NoSQL?

  • NoSQL: http://nosql-database.org/

Other useful background reading:

  • Google MapReduce: http://labs.google.com/papers/mapreduce.html
  • Google Bigtable: http://labs.google.com/papers/bigtable.html
  • Google File System: http://labs.google.com/papers/gfs.html
  • Google Chubby: http://labs.google.com/papers/chubby.html

Hadoop

princy | July 6th, 2011 - 13:40:29

Hadoop:

  • Open-source framework for reliable, distributed computing

Two Core Components:

  • HDFS: distributed, replicated file system; self-healing, high-bandwidth clustered storage; just stores bytes
  • Map/Reduce: API for parallel computing; fault-tolerant distributed processing; a batch system
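
To make the Map/Reduce idea concrete, here is a minimal single-process sketch of the classic word count. It is not Hadoop's API: the function names are made up for illustration, and the shuffle step that Hadoop performs across machines is simulated with an in-memory dictionary.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores bytes", "hadoop runs batch jobs"]))
```

On a real cluster the map calls run in parallel next to the data blocks and the reduce calls run in parallel per key, which is where the fault tolerance and the batch character come from.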

Feature:

  • Designed to scale linearly as data size or analysis complexity grows

Not just NoSQL

  • The Hive project adds SQL support to Hadoop
  • HiveQL compiles to a query plan
  • The query plan executes as MapReduce jobs
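
As a rough illustration of how a SQL-style query can end up as a MapReduce job, the sketch below hand-translates one query into a map step and a reduce step. This is not Hive's planner or its output; the employees table, its columns and the function names are all hypothetical.

```python
from collections import defaultdict

# Query being imitated (hypothetical table and columns):
#   SELECT dept, SUM(salary) FROM employees
#   WHERE salary > 1000 GROUP BY dept

# Hypothetical rows: (name, dept, salary)
EMPLOYEES = [
    ("ann", "search", 3000),
    ("bob", "search", 900),
    ("cho", "storage", 2500),
    ("dee", "storage", 1500),
]


def map_row(row):
    """Map: apply the WHERE filter, then emit (GROUP BY key, value)."""
    _name, dept, salary = row
    if salary > 1000:
        yield dept, salary


def reduce_group(dept, salaries):
    """Reduce: compute the aggregate (SUM) for one group."""
    return dept, sum(salaries)


def run_query(rows):
    """Simulate the job: map every row, group by key, reduce each group."""
    groups = defaultdict(list)
    for row in rows:
        for key, value in map_row(row):
            groups[key].append(value)
    return dict(reduce_group(k, v) for k, v in groups.items())


if __name__ == "__main__":
    print(run_query(EMPLOYEES))
```

The point of the sketch is only the shape: filters and projections fit in the map side, the GROUP BY column becomes the shuffle key, and the aggregate runs in the reducer.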

Hadoop users

  • Yahoo, Facebook, Twitter

Related projects

  • ZooKeeper – distributed synchronization
  • Avro – data serialization / RPC
  • HBase – structured, distributed database on top of a horizontally scalable file system (HDFS)

Ecosystem of Hadoop

[Figure: Hadoop ecosystem diagram]

Useful links:

http://hadoop.apache.org/common/

http://www.cloudera.com/

http://www.slideshare.net/cloudera/tokyo-nosqlslidesonly

http://www.slideshare.net/xefyr/introduction-to-hadoop-hbase-and-nosql

http://www.slideshare.net/adorepump/hbase-nosql