Getting Started
Documentation
July 7, 2014
© 2014 DataStax. All rights reserved.
Contents
Overview
Cassandra
  Installing Cassandra
    Installing DataStax Community on RHEL, CentOS, or Oracle Linux
    Installing DataStax Community on Debian or Ubuntu
    Installing DataStax Community on any Linux platform or Mac OS X from the tarball
    Installing DataStax Community on Windows
  Moving data to or from other databases
DataStax Enterprise
  Installing DataStax Enterprise
    Installing DataStax Enterprise using the GUI-based installer
    Installing DataStax Enterprise on Linux without root permissions or Mac OS X
    Other install methods
  DataStax Demos
  Moving data to or from other databases
OpsCenter
  Installing OpsCenter
    Installing OpsCenter on RHEL-based systems
    Installing OpsCenter on Debian or Ubuntu
    Installing OpsCenter on any Linux platform or Mac OS X
    Installing DataStax Community on Windows
    Troubleshooting agent installs
  Creating a cluster using OpsCenter
Using DevCenter
  Installing DevCenter
  GUI
    Connection Manager
    Query Editor
    Schema Navigator
    Outline
    Results
    CQL Scripts
  A tutorial on using DevCenter
Key concepts and data model
Querying Cassandra
Tips for using DataStax documentation
Overview
This guide introduces the latest versions of Cassandra, DataStax Enterprise, and OpsCenter. It shows you
how to install and set up a single-node cluster for evaluation.
You can install the cluster one of two ways:
• Using the OpsCenter GUI.
• Installing on the command line:
  • Cassandra
  • DataStax Enterprise
For information about setting up a production cluster, see the Cassandra or DataStax Enterprise
documentation.
Cassandra
An introduction to Cassandra, installation instructions, and moving data to or from other databases.
DataStax Enterprise
An introduction to DataStax Enterprise, installation instructions, moving data to or from other databases,
and the DSE demos.
OpsCenter
A visual management and monitoring solution for Cassandra and DataStax Enterprise. At the end of each
install section is a link to the correct procedure.
DevCenter
A free visual query IDE for developers, administrators, and others who want to create and run Cassandra
Query Language (CQL) statements against Apache Cassandra and DataStax Enterprise.
Key database concepts
A 30-second introduction to key Cassandra terminology.
Cassandra data model
• The data model distilled - a brief introduction to the basic elements of the data model
• Getting Started with Time Series Data Modeling white paper
• Getting Started with User Profile Data Modeling white paper
• Become a Super Modeler webinar
• The Data Model is Dead, Long Live the Data Model webinar
• C* Summit 2013: The World's Next Top Data Model webinar
Querying Cassandra
Quickly master inserting and retrieving data from Cassandra using CQL.
Cassandra
What is Apache Cassandra?
Apache Cassandra™ is a massively scalable open source NoSQL database. Cassandra is perfect for
managing large amounts of structured, semi-structured, and unstructured data across multiple data centers
and the cloud. Cassandra delivers continuous availability, linear scalability, and operational simplicity
across many commodity servers with no single point of failure, along with a powerful dynamic data model
designed for maximum flexibility and fast response times.
How does Cassandra work?
Cassandra has a “masterless” architecture, meaning that all nodes are the same. Cassandra provides
automatic data distribution across all nodes that participate in a “ring” or database cluster. Developers and
administrators do not need to write any code to distribute data across a cluster, because data is
transparently partitioned across all nodes in the cluster.
Cassandra also provides built-in and customizable replication, which stores redundant copies of data
across nodes that participate in a Cassandra ring. This means that if any node in a cluster goes down,
one or more copies of that node’s data are available on other machines in the cluster. Replication can be
configured to work across one data center, many data centers, and multiple cloud availability zones.
Cassandra scales linearly, meaning that you can add capacity simply by adding new nodes online.
For example, if 2 nodes can handle 100,000 transactions per second, 4 nodes will support
200,000 transactions per second and 8 nodes will handle 400,000 transactions per second.
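The arithmetic in this example can be sketched as a short shell function. The per-node figure of 50,000 transactions per second is derived only from the example above (100,000 across 2 nodes) and is illustrative, not a benchmark. The function name estimate_tps is hypothetical, and real throughput depends on hardware, workload, and configuration.

```shell
# Illustrative only: models ideal linear scaling from the example above,
# where 2 nodes handle 100,000 transactions per second.
# estimate_tps is a hypothetical helper, not part of Cassandra.
TPS_PER_NODE=50000   # assumption: 100000 / 2, taken from the example

estimate_tps() {
    # $1 = number of nodes in the cluster
    echo $(( $1 * TPS_PER_NODE ))
}

for nodes in 2 4 8; do
    echo "$nodes nodes: $(estimate_tps "$nodes") transactions/sec"
done
```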
10 Minute Cassandra Walkthrough
Planet Cassandra provides a 10 Minute Cassandra Walkthrough where you can download a Cassandra
virtual machine (VMware or VirtualBox). You can also take short video courses for developers and
administrators that demonstrate various Cassandra features.
Cassandra data model
• The data model distilled - a brief introduction to the basic elements of the data model
• Getting Started with Time Series Data Modeling white paper
• Getting Started with User Profile Data Modeling white paper
• Become a Super Modeler webinar
• The Data Model is Dead, Long Live the Data Model webinar
• C* Summit 2013: The World's Next Top Data Model webinar
Installing Cassandra
Installing DataStax Community on RHEL, CentOS, or Oracle Linux
Install using a yum repository.
About this task
If you have trouble installing, see the full installation documentation. For information about setting up a
production cluster, see the Cassandra documentation.
Before you begin
• Oracle Java 7 must be installed. To install, see Installing Oracle JRE on RHEL-based Systems.
• Root or sudo access.
• Python 2.6+ (needed if installing OpsCenter).
• 256MB of memory (only for testing light workloads). If using a virtual machine, be sure to use the
recommended memory allocation or more for your operating system.
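Before starting the procedure, you can check some of the prerequisites above from a terminal. The following is a minimal sketch: check_prereq is a hypothetical helper that only reports whether a command is on the PATH. It does not verify the Java vendor or any version numbers, so run java -version and python --version yourself to confirm those.

```shell
# Hypothetical helper: reports whether a command is available on the PATH.
# It does not check versions; confirm those with `java -version` and
# `python --version`.
check_prereq() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Java is required; Python is needed only if you install OpsCenter.
for cmd in java python; do
    check_prereq "$cmd" || true
done
```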
Procedure
In a terminal window:
1. Add a DataStax Community repository file called /etc/yum.repos.d/datastax.repo.
[datastax]
name = DataStax Repo for Apache Cassandra
baseurl = http://rpm.datastax.com/community
enabled = 1
gpgcheck = 0
2. Install the packages:
$ sudo yum install dsc20
3. Start DataStax Community (as a single-node cluster):
$ sudo service cassandra start
On some Linux distributions, you may need to use:
$ sudo /etc/init.d/cassandra start
4. Verify that DataStax Community is running:
$ nodetool status
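Step 1 can also be done from the command line rather than in an editor. The sketch below prints the repository definition shown in step 1. The name datastax_repo_contents is a hypothetical helper, and piping its output through sudo tee is one common way to write a root-owned file, not the only one.

```shell
# Hypothetical helper: prints the repository definition from step 1.
datastax_repo_contents() {
    cat <<'EOF'
[datastax]
name = DataStax Repo for Apache Cassandra
baseurl = http://rpm.datastax.com/community
enabled = 1
gpgcheck = 0
EOF
}

# Preview the contents before writing anything to disk.
datastax_repo_contents
```

To create the file, pipe the output to the path named in step 1:
$ datastax_repo_contents | sudo tee /etc/yum.repos.d/datastax.repo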
What to do next
• Install and set up OpsCenter (Optional).
• Key concepts.
• The data model distilled.
• Take the DevCenter tutorial.
• Set up a single or multiple data center cluster.
Installing DataStax Community on Debian or Ubuntu
Install using an APT repository.
About this task
If you have trouble installing, see the full installation documentation. For information about setting up a
production cluster, see the Cassandra documentation.
Before you begin
• Oracle Java 7 must be installed. To install, see Installing Oracle JRE on Debian or Ubuntu Systems.
• Root or sudo access.
• Python 2.6+ (needed if installing OpsCenter).
• 256MB of memory (only for testing light workloads). If using a virtual machine, be sure to use the
recommended memory allocation or more for your operating system.
Procedure
In a terminal window:
1. Add the DataStax Community repository to the /etc/apt/sources.list.d/cassandra.sources.list file:
$ echo "deb http://debian.datastax.com/community stable main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
2. Add the DataStax repository key to your aptitude trusted keys.
$ curl -L http://debian.datastax.com/debian/repo_key | sudo apt-key add -
3. Install the package:
$ sudo apt-get update
$ sudo apt-get install dsc20
This installs the DataStax Community distribution of Cassandra. The Debian packages start the
Cassandra service automatically.
4. Verify that DataStax Community is running:
$ nodetool status
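If you script the installation, the verification in step 4 can be automated. This sketch assumes that nodetool status marks each healthy node with a line beginning with UN (Up/Normal). The name node_is_up is a hypothetical helper, and you should compare the pattern against the output of your own version before relying on it.

```shell
# Hypothetical helper: reads `nodetool status` output on stdin and succeeds
# if at least one line starts with "UN" (assumed to mean Up/Normal).
node_is_up() {
    grep -q '^UN'
}

# Only run the check when nodetool is installed and on the PATH.
if command -v nodetool >/dev/null 2>&1; then
    if nodetool status 2>/dev/null | node_is_up; then
        echo "Cassandra node is up"
    else
        echo "Cassandra node is not up yet"
    fi
fi
```

Because the service can take a few seconds to start, a script may need to repeat this check after a short pause before treating the result as a failure.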
What to do next
• Install and set up OpsCenter (Optional).
• Set up a single or multiple data center cluster.
• Key concepts.
• Take the DevCenter tutorial.
• The data model distilled.
Installing DataStax Community on any Linux platform or Mac OS X from the tarball
Use this method to install a single-node cluster on Mac OS X and platforms without package support, or if
you do not have or want a root installation.
About this task
If you have trouble installing, see the full installation documentation. For information about setting up a
production cluster, see the Cassandra documentation.
Before you begin
• Oracle Java 7 must be installed. To install, see Installing Oracle JRE on RHEL-based Systems or
Installing Oracle JRE on Debian or Ubuntu Systems.
• Python 2.6+ (needed if installing OpsCenter).
• 256MB of memory (only for testing light workloads). If using a virtual machine, be sure to use the
recommended memory allocation or more for your operating system.
Procedure
In a terminal window:
1. Download and untar the DataStax Community tarball:
$ curl -L http://downloads.datastax.com/community/dsc.tar.gz | tar xz
You can also download from Planet Cassandra.
2. Go to the install directory:
$ cd dsc-cassandra-2.0.x
3. To install without root permissions, see Installing without root permissions.
4. Start DataStax Community from the install directory:
$ sudo bin/cassandra
5. Verify that DataStax Community is running. From the install directory:
$ bin/nodetool status
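The x in dsc-cassandra-2.0.x in step 2 stands for whatever patch release the tarball contains. As a sketch, the extracted directory can be located with a shell glob instead of typing the version by hand. The name find_install_dir is a hypothetical helper, and it assumes exactly the dsc-cassandra-2.0.* naming shown in step 2.

```shell
# Hypothetical helper: prints the first directory matching the naming
# pattern from step 2 inside the given parent directory (default: current).
find_install_dir() {
    parent=${1:-.}
    for dir in "$parent"/dsc-cassandra-2.0.*; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    echo "no dsc-cassandra-2.0.* directory found in $parent" >&2
    return 1
}

# Change into the install directory if one was found.
if install_dir=$(find_install_dir); then
    cd "$install_dir"
fi
```

If more than one matching directory exists, the helper returns the first one in glob order, so remove old extractions or pass the version explicitly in that case.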
What to do next
• Install and set up OpsCenter (Optional).
• Key concepts.
• The data model distilled.
• Take the DevCenter tutorial.
• Set up a single or multiple data center cluster.
Installing without root permissions
Installing Cassandra when you don't have or want to use sudo or root permissions.
About this task
Before performing these steps, you must have completed steps 1 and 2 in Any Linux system or Mac OS X.
Procedure
1. In the install directory, create the data and log directories: