collects the feedback from PostHog and generates an issue with the reader feedback. The same workflow queries Algolia
+for the top successful searches, and the top failed searches in the docs. This informs the documentation team and
+product management on which features or commands are important to readers.
+
+== CI checks
+
+Link checking, Markdown linting, and build tests are done on each commit to documentation pull requests by the https://github.com/StarRocks/starrocks/blob/main/.github/workflows/ci-doc-checker.yml#L62-L135[doc CI job^,target="_blank"].
content/documentation/modules/ROOT/pages/index.adoc (+23 -23)
@@ -13,53 +13,53 @@ or have been modified.
This example is designed to be followed step by step to integrate the database with a specific
third-party visualization tool.

-When I wrote this guide I pulled out the reusable content (the ports used by the database and where to
-find the connection details in the commercial UI) into reusable snippets and these are imported wherever
+When I wrote this guide, I pulled out the reusable content (the ports used by the database and where to
+find the connection details in the commercial UI) into reusable snippets. These are now imported wherever
needed. Imports have always been available in Asciidoc, but this was missing in Docusaurus until recently.

Key features of this guide:

* Identify the goal and show the end result.
* Provide a step-by-step procedure.
* Include all the information necessary to complete the task.
-* Limit links out to download files necessary to implement the integration and a sample dataset used to verify the integration.
+* Limit links to those needed for downloading files to implement the integration and a sample dataset used to verify the integration.

The rest of the visualization tool integrations at ClickHouse follow the same pattern.

-Typically, integration documentation would be limited to "install the driver, add the connection string, press the test button". I deviated from this because the community often had problems with using the integration once the connection was established. Deflecting issues reported in Slack and to the support desk is important to both the users and the support team.
+Typically, integration documentation would be limited to "install the driver, add the connection string, press the test button". I deviated from this because the community often had problems using the integration once the connection was established. Deflecting issues reported in Slack and to the support desk is important to both the users and the support team.
This is a guide for someone who has already gone through the basics of starting the database, creating a
-"Hello World" table and loading a few rows of data. This document exemplifies one place where I
-combine explanation with a how-to guide. This is meant to both teach someone how to load data, and
+"Hello World" table, and loading a few rows of data. This document exemplifies one place where I
+combine explanation with a how-to guide. This is meant to both teach someone how to load data and
explain what they should be considering while they work through the process.

-In the NYPD Complaint Data guide I guide a new user of the ClickHouse analytical database through
+In the NYPD Complaint Data documentation, I guide a new user of the ClickHouse analytical database through
investigating the structure and content of an input file containing a dataset, determining the proper
schema for the database table that the data will be stored in, how to transform the data while
ingesting it, and how to run some interesting queries against that data.

-Most guides in this product space tell the reader "type this, click that, clean up". I find that
+Most guides in this product space tell the reader "Type this, click that, clean up". I find that
type of guide to be boring, and I wonder if the method presented is a "good" method or the simplest
for the author to write.

Database guides often use very simple datasets that are guaranteed to work. This is necessary for the
-very first tutorial type content designed to get the product installed and the very first table created.
+very first tutorial-type content designed to get the product installed and the very first table created.
Beyond that point, the reader needs to learn about how to understand their data and the database so
that they can make proper decisions. When I wrote this guide I had almost no experience with the
-product. My mentor recommended that I "figure it out and write down everything that I learned". This
-first example is the result of that advice.
+product. My mentor recommended that I "Write down everything that I learn while working through the
+process". This first example is the result of that advice.

Here is an example from the NYPD Complaint Data document that I believe is a good way to present
a system for learning about the data, and properly configuring the database table to store the data
efficiently:

NOTE: The queries are not shown in the excerpt.

-> In order to figure out what types should be used for the fields it is necessary to know what the data looks like. For example, the field JURISDICTION_CODE is a numeric: should it be a UInt8, or an Enum, or is Float64 appropriate?
+> To figure out what types should be used for the fields it is necessary to know what the data looks like. For example, the field JURISDICTION_CODE is a numeric: should it be a UInt8, or an Enum, or is Float64 appropriate?
>
> The query response shows that the JURISDICTION_CODE fits well in a UInt8.
>
@@ -70,13 +70,13 @@ NOTE: The queries are not shown in the excerpt.
> The dataset in use at the time of writing has only a few hundred distinct parks and playgrounds in the PARK_NM column. This is a small number based on the LowCardinality recommendation to stay below 10,000 distinct strings in a LowCardinality(String) field.

The document continues to teach a few more very important techniques for analyzing and manipulating
-data, and then finishes up with some queries and advice on what to learn next.
+data and then finishes up with some queries and advice on what to learn next.

https://web.archive.org/web/20230317111529/https://clickhouse.com/docs/en/getting-started/example-datasets/nypd_complaint_data[ClickHouse guide to analyzing NYPD complaint data^,target="_blank"]

== Solution guides

-The documentation at Elastic was traditionally product based. This meant that the documentation was split up into these separate sets:
+The documentation at Elastic was traditionally product-based. This meant that the documentation was split up into these separate sets:

* Search engine
* Visualization tool
@@ -85,13 +85,13 @@ The documentation at Elastic was traditionally product based. This meant that th

This separation of the documentation meant that the reader had to know which tools they needed, what terminology each of the tools used to describe the same idea, and which tool to pick if there were multiple options for a specific task. This issue hit me personally when I was trying to set up a new feature. I searched for the feature and the search engine documentation came up first in the results, so I followed that guide. I had to use pages of JSON configuration to get the integration working. I was speaking with some of the other writers about how difficult this was to configure, and the writer for the visualization tool told me that there was a button to configure that. This conversation led to regular knowledge sharing among the writers and the course developers so that we could provide end-to-end scenario-based documentation that highlighted the best way to accomplish tasks. There are several solution guides, and I worked on these:

-https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-observability.html[Getting started with Observability^,target="_blank"]
+https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-observability.html[Getting Started with Observability^,target="_blank"]

https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-kubernetes.html[Monitor your Kubernetes Infrastructure^,target="_blank"]

https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-siem-security.html[Use Elastic Security for SIEM^,target="_blank"]

-== Use-case based How-To guides
+== Use-case-based How-To guides

I am not a fan of using four lines of data to introduce the reader to a database product capable of
ingesting and analyzing billions of rows of data or analyzing those billions of rows of data where they
@@ -103,10 +103,10 @@ that align with the needs of the community:
* Analyzing data in Apache Iceberg
* Analyzing data in Apache Hudi

-There is some complexity to configuring the integrations with cloud storage, Apache Iceberg, and Apache Hudi. To make this easier for the reader I wrote Docker compose files to deploy MinIO, Iceberg, and Hudi. I think that this is appropriate, as the reader that wants to use external storage with StarRocks is likely familiar with the external storage. In addition to the compose files I documented the settings necessary, and in the case of the Hudi integration I submitted a pull request to the Hudi maintainers to improve their compose-based tutorial.
+There is some complexity to configuring the integrations with cloud storage, Apache Iceberg, and Apache Hudi. To make this easier for the reader I wrote Docker compose files to deploy MinIO, Iceberg, and Hudi. I think that this is appropriate, as the reader who wants to use external storage with StarRocks is likely familiar with the external storage. In addition to the compose files I documented the settings necessary, and in the case of the Hudi integration I submitted a pull request to the Hudi maintainers to improve their compose-based tutorial.

The "Basics" Quick Start is a step-by-step guide with no explanation until the end. There are some
-complicated manipulations of the data during loading. In the document I ask the reader to wait until they
+complicated manipulations of the data during loading. In the document, I ask the reader to wait until they
have finished the entire process and promise to provide them with the details.

> The curl commands look complex, but they are explained in detail at the end of the tutorial. For now, we recommend running the commands and running some SQL to analyze the data, and then reading about the data loading details at the end.
I was working the Elastic booth at Kubecon 2018 and almost everyone who came to visit the booth told me that they loved
-Elasticsearch. As the PMM for ingest products I was interested in what agents were popular with the community. All but a
-handful of the people I spoke with were using Fluentd or Fluent Bit to feed Logstash. In order to raise awareness of Elastic
+Elasticsearch. As the PMM for ingest products, I was interested in what agents were popular with the community. All but a
+handful of the people I spoke with were using Fluentd or Fluent Bit to feed Logstash. To raise awareness of Elastic
agents similar in functionality to Fluentd and Fluent Bit I joined the Kubernetes SIG-Docs and published this guide in the
Kubernetes documentation.

@@ -173,5 +173,5 @@ https://www.elastic.co/customer-success/resources?tab=2[Elastic Support Engineer

Some people prefer a short video when they want an introduction to a new technique. I recorded this to give people an overview of the https://www.youtube.com/watch?v=IO_uXPKQht0[Elastic Kubernetes operator^,target="_blank"].

-There are more blogs, videos, and webinars available in the
+There are more blogs, videos, and webinars available on the