| 1 | +# HBase in Standalone Mode |
| 2 | + |
| 3 | +## 1. Pull the required Docker Images and use them to create and run the Docker Containers |
| 4 | + |
| 5 | +Delete the 4 previous containers, then create and run the Docker containers specified in [Docker-Compose.yaml](Docker-Compose.yaml), i.e., |
| 6 | + |
| 7 | +* HBase-Master |
| 8 | +* HBase-Regionserver |
| 9 | +* Zookeeper |
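A minimal sketch of this step, assuming Docker Compose v2 is installed and the commands are run from the directory that contains `Docker-Compose.yaml`. The names of the four previous containers are not given here, so `CONTAINER_1` to `CONTAINER_4` are placeholders to replace with the names that `docker ps -a` reports:

```shell
# List all containers (running and stopped) to find the ones to delete
docker ps -a

# Remove the previous containers; CONTAINER_1 ... CONTAINER_4 are placeholders
docker rm -f CONTAINER_1 CONTAINER_2 CONTAINER_3 CONTAINER_4

# Create and start the containers defined in the Compose file, in the background
docker compose -f Docker-Compose.yaml up -d

# Confirm that the new containers are running
docker ps
```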
| 10 | + |
| 11 | +## 2. Connect to the HBase Shell hosted in the `hbase-master` Docker Container |
| 12 | + |
| 13 | +HBase Shell is a JRuby-based command-line program you can use to interact with HBase. |
| 14 | + |
| 15 | +```shell |
| 16 | +docker exec -it hbase-master hbase shell |
| 17 | +``` |
| 18 | + |
| 19 | +You can also confirm that HBase is running via its Web-UI: [http://localhost:16010/](http://localhost:16010/) |
| 20 | + |
| 21 | +Execute the following statements in HBase shell: |
| 22 | + |
| 23 | +```shell |
| 24 | +# To show the version of HBase (it should be version 2.1.3) |
| 25 | +version |
| 26 | + |
| 27 | +# To show the details of the servers running HBase: |
| 28 | +# The output according to the setup should be: |
| 29 | +# 1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load |
| 30 | +status |
| 31 | +``` |
| 32 | + |
| 33 | +## 3. Getting Help |
| 34 | + |
| 35 | +To get guidance on a specific command: |
| 36 | + |
| 37 | +```shell |
| 38 | +# Replace COMMAND with the command you want guidance on |
| 39 | +help 'COMMAND' |
| 40 | +``` |
| 41 | + |
| 42 | +For general guidance on how to use table-referenced commands: |
| 43 | + |
| 44 | +```shell |
| 45 | +table_help |
| 46 | +``` |
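As an illustration of what a table-referenced command looks like, the sketch below assigns a table to a variable and then calls commands on it. It assumes a table named `wiki` with a column family `text` already exists (it is created in section 4); the variable name `t` is arbitrary.

```shell
# Get a reference to an existing table
t = get_table 'wiki'

# Commands are then called on the reference, without repeating the table name
t.put 'Home', 'text:', 'Welcome to the wiki!'
t.get 'Home'
t.scan
t.describe
```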
| 47 | + |
| 48 | +## 4. Create a Table |
| 49 | + |
| 50 | +The table `emp` has 2 column families: |
| 51 | + |
| 52 | +* personal data |
| 53 | +* professional data |
| 54 | + |
| 55 | +```shell |
| 56 | +create 'emp', 'personal data', 'professional data' |
| 57 | + |
| 58 | +# The table employee also has 2 column families |
| 59 | +create 'employee', 'Personally_Identifiable_Information_PII', 'KPI_Appraisal' |
| 60 | +``` |
| 61 | + |
| 62 | +The table `wiki` has 1 column family: |
| 63 | + |
| 64 | +* text |
| 65 | + |
| 66 | +```shell |
| 67 | +create 'wiki', 'text' |
| 68 | +``` |
| 69 | + |
| 70 | +Verify that the tables have been created: |
| 71 | + |
| 72 | +```shell |
| 73 | +list |
| 74 | +``` |
| 75 | + |
| 76 | +## 5. View the table's metadata |
| 77 | + |
| 78 | +Execute the following to view the metadata of the created table: |
| 79 | + |
| 80 | +```shell |
| 81 | +describe 'wiki' |
| 82 | +``` |
| 83 | + |
| 84 | +## 6. Insert data |
| 85 | + |
| 86 | +The `put` command inserts data into HBase. The following statement inserts a new row with the key **`Home`**, storing **`Welcome to the wiki!`** in the column `text:`, that is, the column family `text` with an empty column qualifier. To target a named column within the family, specify it as `[column family]:[column]`. |
| 87 | + |
| 88 | +```shell |
| 89 | +put 'wiki', 'Home', 'text:', 'Welcome to the wiki!' |
| 90 | +``` |
| 91 | + |
| 92 | +Unfortunately, the `put` command in HBase shell allows you to insert only one column value at a time. |
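Writing several columns to the same row therefore takes one `put` per column. A short sketch, in which the row key `About` and the column qualifiers `title` and `body` are made up for illustration:

```shell
# One put per column value, all addressed to the same row key
put 'wiki', 'About', 'text:title', 'About this wiki'
put 'wiki', 'About', 'text:body', 'This wiki is stored in HBase.'

# Both values now belong to the same row
get 'wiki', 'About'
```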
| 93 | + |
| 94 | +## 7. Retrieve data |
| 95 | + |
| 96 | +The `get` command retrieves data from HBase. It requires the **table name** and the **row key**; an optional column, as in the example below, restricts the result to that column. |
| 97 | + |
| 98 | +```shell |
| 99 | +get 'wiki', 'Home', 'text:' |
| 100 | +``` |
| 101 | + |
| 102 | +The `scan` command retrieves all the rows of a table. This is compute-intensive for large tables and should be avoided in production. By default, HBase stamps each inserted value with the current time and returns the value with the most recent timestamp when retrieving data. |
| 103 | + |
| 104 | +```shell |
| 105 | +scan 'wiki' |
| 106 | +``` |
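Both points can be sketched with standard shell options: `LIMIT` and `COLUMNS` narrow a scan, an optional fifth argument to `put` sets an explicit timestamp, and `VERSIONS` asks `get` for more than the latest value. The row key `Sandbox` and the timestamps `1` and `2` are arbitrary choices for illustration. How many versions actually come back depends on the `VERSIONS` setting of the column family.

```shell
# Return at most 10 rows, restricted to the text column family
scan 'wiki', { COLUMNS => ['text'], LIMIT => 10 }

# Write two values to the same column with explicit timestamps (fifth argument)
put 'wiki', 'Sandbox', 'text:', 'first draft', 1
put 'wiki', 'Sandbox', 'text:', 'second draft', 2

# Without further options, only the most recent value is returned
get 'wiki', 'Sandbox', 'text:'

# Ask for up to 2 versions of the column
get 'wiki', 'Sandbox', { COLUMN => 'text:', VERSIONS => 2 }
```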
| 107 | + |
| 108 | +## 8. Altering Tables |
| 109 | + |
| 110 | +Altering tables is computationally expensive because HBase creates a new column family with the chosen specifications and then copies all the data to the new column family. |
| 111 | + |
| 112 | +* Disable the table |
| 113 | + |
| 114 | +```shell |
| 115 | +disable 'wiki' |
| 116 | +``` |
| 117 | + |
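| 118 | +By default, HBase 0.96 and later (including the 2.1.3 release used here) keeps only 1 version of each value; earlier releases kept 3. Each version is identified by its timestamp. The number of versions kept can be changed as follows: |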
| 119 | + |
| 120 | +```shell |
| 121 | +alter 'wiki', { NAME => 'text', VERSIONS => org.apache.hadoop.hbase.HConstants::ALL_VERSIONS } |
| 122 | +``` |
| 123 | + |
| 124 | +We can also add a column family (while the table is still disabled). The new column family is called `revision`. |
| 125 | + |
| 126 | +```shell |
| 127 | +alter 'wiki', { NAME => 'revision', VERSIONS => org.apache.hadoop.hbase.HConstants::ALL_VERSIONS } |
| 128 | +``` |
| 129 | + |
| 130 | +Similar to the `text` column family, the `revision` column family is added without any columns. |
| 131 | + |
| 132 | +It is up to the user to honour the schema. HBase does not enforce column names within a column family, so if the user decides not to honour the schema, e.g., by adding data to `revision:new_column`, HBase will not stop them. |
| 133 | + |
| 134 | +Lastly, we can set the compression method and the Bloom filter type as follows: |
| 135 | + |
| 136 | +```shell |
| 137 | +alter 'wiki', {NAME=>'text', COMPRESSION=>'GZ', BLOOMFILTER=>'ROW'} |
| 138 | +``` |
| 139 | + |
| 140 | +* Enable the table |
| 141 | + |
| 142 | +```shell |
| 143 | +enable 'wiki' |
| 144 | +``` |
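Once the table is enabled again, the changes can be checked and the schema flexibility described above tried out. A minimal sketch; the qualifier `author` and both values are invented for illustration:

```shell
# Confirm the new settings (VERSIONS, COMPRESSION, BLOOMFILTER) and the revision family
describe 'wiki'

# Column qualifiers are created on write; no schema change is needed
put 'wiki', 'Home', 'revision:author', 'jane'
put 'wiki', 'Home', 'revision:new_column', 'also accepted'

# Show every column stored for the row
get 'wiki', 'Home'
```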