.. -*- rst -*-

.. highlightlang:: none

.. groonga-command
.. database: tutorial

Basic operations
================

The groonga package provides a C library (libgroonga) and a command line tool (groonga). This tutorial explains how to use the groonga command, with which you can create/operate databases, start a server, establish a connection with a server, etc.

Create a database
-----------------

You can create a new database with the following command.

Form::

  groonga -n DB_PATH_NAME

The '-n' option specifies to create a new database. DB_PATH_NAME specifies the path of the new database. Note that this command fails if the specified path already exists.

This command creates a database and then enters into interactive mode in which groonga prompts you to enter commands for operating that database. You can terminate this mode with Ctrl-d.

Execution example::

  % groonga -n /tmp/tutorial.db
  > Ctrl-d
  %

Operate a database
------------------

Form::

  groonga DB_PATH_NAME [COMMAND]

DB_PATH_NAME specifies the path of a target database.

If COMMAND is specified, groonga executes COMMAND and returns the result. Otherwise, groonga starts in interactive mode that reads commands from the standard input and execute them one by one. This tutorial focuses on the interactive mode.

Let's try to see the status of a groonga process by using a :doc:`/commands/status` command.

.. groonga-command
.. include:: ../example/tutorial/introduction-1.log
.. .. % groonga -n /tmp/groonga-databases/introduction.db
.. status

As shown in the above example, a command basically returns a JSON array. The first element contains an error code, execution time, etc. The second element is the result of an operation.

Command format
--------------

Commands for operating a database accept arguments as follows::

 Form_1: COMMAND VALUE_1 VALUE_2 ..

 Form_2: COMMAND --NAME_1 VALUE_1 --NAME_2 VALUE_2 ..

In the first form, arguments must be passed in order. This kind of arguments are called positional arguments because the position of each argument determines its meaning.

In the second form, you can specify a parameter name with its value. So, the order of arguments is not defined. This kind of arguments are known as named parameters or keyword arguments.

If you want to specify a value which contains white-spaces or special characters, such as quotes and parentheses, please enclose the value with single-quotes or double-quotes.

For details, see also the paragraph of "command" in :doc:`/executables/groonga`.

Basic commands
--------------

 :doc:`/commands/status`
  shows status of a groonga process.
 :doc:`/commands/table_list`
  shows a list of tables in a database.
 :doc:`/commands/column_list`
  shows a list of columns in a table.
 :doc:`/commands/table_create`
  adds a table to a database.
 :doc:`/commands/column_create`
  adds a column to a table.
 :doc:`/commands/select`
  searches records from a table and shows the result.
 :doc:`/commands/load`
  inserts records to a table.

Create a table
--------------

A :doc:`/commands/table_create` command creates a table.

In most cases, a table of groonga has a primary key which must be specified with its data type and index type. 

There are various data types such as integers, floating-point numbers, etc. The index type determines the search performance and the availability of prefix searches. We will explain the details later.

Let's create a 'Site' table which has a primary key of ShortText. In this example, the index type is HASH.

.. groonga-command
.. include:: ../example/tutorial/introduction-2.log
.. table_create --name Site --flags TABLE_HASH_KEY --key_type ShortText

View a table
------------

A :doc:`/commands/select` command shows contents of table.

.. groonga-command
.. include:: ../example/tutorial/introduction-3.log
.. select --table Site

When only a table is specified, the 'select' command returns the first (at most) 10 records of that table. "[0]" in the result shows the number of records in the 'Site' table. The next array is a list of columns. ["_id","Uint32"] is a column of UInt32, named "_id". ["_key","ShortText"] is a column of ShortText, named "_key".

The above two columns, '_id' and '_key', are the necessary columns. The '_id' column stores IDs those are automatically allocated by groonga. The '_key' column is associated with the primary key. You are not allowed to rename these columns.

Create a column
---------------

A :doc:`/commands/column_create` command adds a column to a table.

Let's add a column of ShortText to store titles. You may give a descriptive name 'title' to the column.

.. groonga-command
.. include:: ../example/tutorial/introduction-4.log
.. column_create --table Site --name title --flags COLUMN_SCALAR --type ShortText
.. select --table Site

The COLUMN_SCALAR flag specifies to add a regular column.

Create a lexicon table for full text searches
---------------------------------------------

Let's go on to how to make a full text search.

Groonga uses an inverted index to provide fast full text search. So, the first step is to create a lexicon table which stores an inverted index, also known as postings lists. The primary key of this table is associated with a vocabulary made up of index terms and each record stores postings lists for one index term.

The following shows a command which creates a lexicon table named 'Terms'. The data type of its primary key is ShortText.

.. groonga-command
.. include:: ../example/tutorial/introduction-5.log
.. table_create --name Terms --flags TABLE_PAT_KEY|KEY_NORMALIZE --key_type ShortText --default_tokenizer TokenBigram

The table_create command takes many parameters but you don't need to understand all of them. Please skip the next paragraph if you are not interested in how it works.

The 'TABLE_PAT_KEY' flag specifies to store index terms in a patricia trie. The 'KEY_NORMALIZE' flag specifies to normalize index terms. In this example, both flags are validated by using a '|'. The 'default_tokenizer' parameter specifies a method for tokenizing text. This example specifies 'TokenBigram' that is generally called 'N-gram'.

Create an index column for full text search
-------------------------------------------

The second step is to create an index column, which allows you to search records from its associated column. That is to say this step specifies which column needs an index.

Let's create an index column for the 'title' column in the 'Site' table.

.. groonga-command
.. include:: ../example/tutorial/introduction-6.log
.. column_create --table Terms --name blog_title --flags COLUMN_INDEX|WITH_POSITION --type Site --source title

This command creates an index column 'blog_title' in the 'Terms' table. The '--type' option specifies a target table and the '--source' option specifies a target column. Then, the 'COLUMN_INDEX' flag specifies to create an index column and the 'WITH_POSITION' flag specifies to create a full inverted index, which contains the positions of each index term. This combination 'COLUMN_INDEX|WITH_POSITION' is recommended for the general purpose.

Load data
---------

A :doc:`/commands/load` command loads JSON-formatted records into a table.

The following adds nine records to the 'Site' table.

.. groonga-command
.. include:: ../example/tutorial/introduction-7.log
.. load --table Site
.. [
.. {"_key":"http://example.org/","title":"This is test record 1!"},
.. {"_key":"http://example.net/","title":"test record 2."},
.. {"_key":"http://example.com/","title":"test test record three."},
.. {"_key":"http://example.net/afr","title":"test record four."},
.. {"_key":"http://example.org/aba","title":"test test test record five."},
.. {"_key":"http://example.com/rab","title":"test test test test record six."},
.. {"_key":"http://example.net/atv","title":"test test test record seven."},
.. {"_key":"http://example.org/gat","title":"test test record eight."},
.. {"_key":"http://example.com/vdw","title":"test test record nine."},
.. ]

Let's make sure that these records are correctly stored.

.. groonga-command
.. include:: ../example/tutorial/introduction-8.log
.. select --table Site

Search data
-----------

Before a full text search, let's try to search data by '_id' and '_key'. These columns work as unique keys.

You can search records by using a 'select' command with a 'query' parameter.

.. groonga-command
.. include:: ../example/tutorial/introduction-9.log
.. select --table Site --query _id:1

'_id:1' specifies to search a record whose ID is 1.

Next, let's search a record by a primary key.

.. groonga-command
.. include:: ../example/tutorial/introduction-10.log
.. select --table Site --query "_key:\"http://example.org/\""

'_key:\"http://example.org/\"' specifies to search a record whose primary key is "http://example.org/".

Full text search
----------------

It's time to make a full text search. You can make a full text search query with the same 'query' parameter.

.. groonga-command
.. include:: ../example/tutorial/introduction-11.log
.. select --table Site --query title:@this

This command searches records whose 'title' column contains a string 'this'. In this case, only one record matches this query. Note that the lower case query 'this' matches a capitalized 'This' in the 1st record because 'KEY_NORMALIZE' was specified in lexicon column creation.

The 'select' command has an optional parameter 'match_columns'. This parameter specifies default target columns and it is used when target columns are not specified in a query.[1]_

A combination of '--match_columns title' and '--query this' brings you the same result that '--query title:@this' does.

.. groonga-command
.. include:: ../example/tutorial/introduction-12.log
.. select --table Site --match_columns title --query this

Specify output columns
----------------------

An 'output_columns' parameter in a 'select' command specifies columns to be shown in the search result. If you want to specify more than one columns, please separate column names by commas (,).

.. groonga-command
.. include:: ../example/tutorial/introduction-13.log
.. select --table Site --output_columns _key,title,_score --query title:@test

This command specifies three output columns including the '_score' column, which stores the relevance score of each record.

Specify output ranges
---------------------

A 'select' command returns a part of its search result if 'offset' and/or 'limit' parameters are specified. These parameters are useful to paginate a search result, a widely-used interface which shows a search result on a page by page basis.

An 'offset' parameter specifies the starting point and a 'limit' parameter specifies the maximum number of records to be returned. If you need the first record in a search result, the offset parameter must be 0 or omitted.

.. groonga-command
.. include:: ../example/tutorial/introduction-14.log
.. select --table Site --offset 0 --limit 3
.. select --table Site --offset 3 --limit 3
.. select --table Site --offset 7 --limit 3

Sort
----

A 'select' command sorts its result when used with a 'sortby' parameter.

A 'sortby' parameter specifies a column as a sorting creteria. A search result is arranged in ascending order of the column values. If you want to sort a search result in reverse order, please add a leading hyphen (-) to the column name of a parameter.

.. groonga-command
.. include:: ../example/tutorial/introduction-15.log
.. select --table Site --sortby -_id

You can use the '_score' column as a sorting criteria for ranking a search result.

.. groonga-command
.. include:: ../example/tutorial/introduction-16.log
.. select --table Site --query title:@test --output_columns _id,_score,title --sortby _score

If you want to specify more than one columns, please separate column names by commas. In such a case, a search result is sorted in order of the column values in the first column, and then records having the same values in the first column are sorted in order of the second column values.

.. groonga-command
.. include:: ../example/tutorial/introduction-17.log
.. select --table Site --query title:@test --output_columns _id,_score,title --sortby _score,_id

.. rubric:: footnote

.. [1] Currently, a 'match_columns' parameter is available iff there exists an inverted index for full text search. A 'match_columns' parameter for a regular column is not supported.
