Add ipynb tutorial for text summaries #4718

Merged
merged 5 commits into from
Mar 4, 2021
Changes from 2 commits
171 changes: 134 additions & 37 deletions docs/text_summaries.ipynb
@@ -11,7 +11,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"metadata": {
"cellView": "form",
"id": "su2RaORHpReL"
@@ -60,9 +60,9 @@
"source": [
"## Overview\n",
"\n",
"Using the **TensorFlow Text Summary API,** you can easily log arbitrary text and view it in TensorBoard. This can be extremely helpful to sample and examine your input data, or to [record execution metadata]() or [generated text](). You can also log diagnostic data as text that can be helpful in the course of your model development.\n",
"Using the **TensorFlow Text Summary API,** you can easily log arbitrary text and view it in TensorBoard. This can be extremely helpful to sample and examine your input data, or to record execution metadata or generated text. You can also log diagnostic data as text that can be helpful in the course of your model development.\n",
"\n",
"In this tutorial, you will try out some basic use cases of the Summary API."
"In this tutorial, you will try out some basic use cases of the Text Summary API."
]
},
{
@@ -76,7 +76,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 2,
"metadata": {
"id": "3U5gdCw_nSG3"
},
@@ -94,14 +94,21 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 3,
"metadata": {
"id": "1qIKtOBrqc9Y"
},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"TensorFlow version: 2.5.0-dev20210219\n"
]
}
],
"source": [
"import tensorflow as tf\n",
"from tensorboard.plugins.text.summary_v2 import text\n",
"\n",
"from datetime import datetime\n",
"import json\n",
@@ -121,12 +128,12 @@
"source": [
"## Logging a single piece of text\n",
"\n",
"To understand how the Image Summary API works, you're going to simply log a bit of text and see how it is presented in tensorboard.\n"
"To understand how the Text Summary API works, you're going to simply log a bit of text and see how it is presented in TensorBoard.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"metadata": {
"id": "FxMPcdmvBn9t"
},
@@ -137,7 +144,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"metadata": {
"id": "IJNpyVyxbVtT"
},
@@ -153,7 +160,7 @@
"\n",
"# Using the file writer, log the text.\n",
"with file_writer.as_default():\n",
" tf.summary.text(\"My first text\", my_text, step=0)"
" tf.summary.text(\"first_text\", my_text, step=0)"
]
},
{
@@ -162,16 +169,43 @@
"id": "rngALbRogXe6"
},
"source": [
"Now, use TensorBoard to examine the image. Wait a few seconds for the UI to spin up."
"Now, use TensorBoard to examine the text. Wait a few seconds for the UI to spin up."
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 6,
"metadata": {
"id": "T_X-wIy-lD9f"
},
"outputs": [],
"outputs": [
{
"data": {
"text/html": [
"\n",
" <iframe id=\"tensorboard-frame-9381929c3767b97b\" width=\"100%\" height=\"800\" frameborder=\"0\">\n",
Contributor

I'm not sure we want to cache the output for this iframe? I wouldn't expect it will render anything useful (outside the tensorflow.org site, where I think it gets replaced by a static image), so it seems better to omit IMO.

Ditto elsewhere below where we have the iframe.

Collaborator Author

Removed.

As an aside, I've been editing this file in the JupyterLab UI. Is there an easy way to drop the cell output there, or do I need to edit it somewhere else?

Contributor

I don't know whether there's a way to do that in the JupyterLab UI; I am not very familiar with it. But you can always just edit the JSON file directly.
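
For anyone following this thread, editing the JSON can also be scripted. The snippet below is a minimal sketch, not something taken from this PR: it assumes the standard notebook layout seen in the diff (a top-level `cells` list whose code cells carry `outputs` and `execution_count`), and the helper name `strip_outputs` and the in-memory `demo` notebook are hypothetical.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Clear cached outputs and execution counts from every code cell, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


# Tiny in-memory notebook standing in for the real file (hypothetical content).
demo = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["## Overview\n"]},
        {
            "cell_type": "code",
            "execution_count": 6,
            "metadata": {},
            "outputs": [{"output_type": "display_data", "data": {}, "metadata": {}}],
            "source": ["%tensorboard --logdir logs"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 0,
}

cleaned = strip_outputs(demo)
# Show the code cell after cleaning: outputs is empty, execution_count is null.
print(json.dumps(cleaned["cells"][1], indent=1))
```

To apply this to a file such as `docs/text_summaries.ipynb`, one would load it with `json.load`, pass the result through `strip_outputs`, and write it back with `json.dump`. Markdown cells and cell sources are left untouched.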

" </iframe>\n",
" <script>\n",
" (function() {\n",
" const frame = document.getElementById(\"tensorboard-frame-9381929c3767b97b\");\n",
" const url = new URL(\"/\", window.location);\n",
" const port = 6007;\n",
" if (port) {\n",
" url.port = port;\n",
" }\n",
" frame.src = url;\n",
" })();\n",
" </script>\n",
" "
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%tensorboard --logdir logs"
]
@@ -195,12 +229,12 @@
"\n",
"If you have multiple streams of text, you can keep them in separate namespaces to help organize them, just like scalars or other data.\n",
"\n",
"Note that if you log text at many steps, TensorBoard will subsample the steps to display. Again, this behavior is the same as for scalars, where the data is subsampled to make the presentation manageable."
"Note that if you log text at many steps, TensorBoard will subsample the steps to display so as to make the presentation manageable. You can control the sampling rate using the `--samples_per_plugin` flag."
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 7,
"metadata": {
"id": "dda6960f0119"
},
@@ -213,24 +247,51 @@
"\n",
"# Using the file writer, log the text.\n",
"with file_writer.as_default():\n",
" with tf.name_scope(\"Name scope 1\"):\n",
" with tf.name_scope(\"name_scope_1\"):\n",
" for step in range(20):\n",
" tf.summary.text(\"a stream of text\", f\"Hello from step {step}\", step=step)\n",
" tf.summary.text(\"another stream of text\", f\"This can be kept separate {step}\", step=step)\n",
" with tf.name_scope(\"Name scope 2\"):\n",
" tf.summary.text(\"just from step 0\", f\"This is an important announcement from step 0\", step=0)\n",
" tf.summary.text(\"a_stream_of_text\", f\"Hello from step {step}\", step=step)\n",
" tf.summary.text(\"another_stream_of_text\", f\"This can be kept separate {step}\", step=step)\n",
" with tf.name_scope(\"name_scope_2\"):\n",
" tf.summary.text(\"just_from_step_0\", \"This is an important announcement from step 0\", step=0)\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 14,
"metadata": {
"id": "515199f4b547"
},
"outputs": [],
"outputs": [
{
"data": {
"text/html": [
"\n",
" <iframe id=\"tensorboard-frame-4a8120aa135ba5f2\" width=\"100%\" height=\"800\" frameborder=\"0\">\n",
" </iframe>\n",
" <script>\n",
" (function() {\n",
" const frame = document.getElementById(\"tensorboard-frame-4a8120aa135ba5f2\");\n",
" const url = new URL(\"/\", window.location);\n",
" const port = 6010;\n",
" if (port) {\n",
" url.port = port;\n",
" }\n",
" frame.src = url;\n",
" })();\n",
" </script>\n",
" "
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%tensorboard --logdir logs"
"%tensorboard --logdir logs/multiple_texts --samples_per_plugin 'text=5'"
Contributor

It seems like a better example would set the sample count to something higher (e.g. at least 20, since that's the step count here, or maybe something like 100). It seems a lot more likely that folks will want to increase the count than decrease it (relative to the default value of 10).

Collaborator Author

I intentionally kept 5 here to illustrate to the user that subsampling occurred. I suspect that this will encourage users to play with the parameter. If we set it higher than 20, users will probably overlook it.

]
},
{
@@ -239,14 +300,14 @@
"id": "bjACE1lAsqUd"
},
"source": [
"## Using markdown\n",
"## Markdown interpretation\n",
"\n",
"Tensorboard supports logging text in markdown to make it easier to read and understand."
"TensorBoard interprets text summaries as Markdown, since rich formatting can make the data you log easier to read and understand, as shown below. (If you don't want Markdown interpretation, see [this issue](https://github.com/tensorflow/tensorboard/issues/830) for workarounds to suppress interpretation.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 9,
"metadata": {
"id": "iHUjCXbetIpb"
},
@@ -272,20 +333,20 @@
"}\n",
"\n",
"\n",
"# TODO: Update this example when TensorBoard is released with\n",
"# https://github.com/tensorflow/tensorboard/pull/4585\n",
"# which supports fenced codeblocks in Markdown.\n",
"def pretty_json(hp):\n",
" json_hp = json.dumps(hp, indent=2)\n",
" s = \"\"\n",
" for line in json_hp.splitlines():\n",
" s += \"\\t\" + line + \"\\n\"\n",
" return s\n",
" return \"\".join(\"\\t\" + line for line in json_hp.splitlines(True))\n",
"\n",
"markdown_text = \"\"\"\n",
"### Markdown Text\n",
"\n",
"TensorBorad supports a number of markdown idioms, such as\n",
"TensorBoard supports basic markdown syntax, including:\n",
"\n",
" preformatted code\n",
" \n",
"\n",
"**bold text**\n",
"\n",
"| and | tables |\n",
@@ -294,18 +355,54 @@
"\"\"\"\n",
"\n",
"with file_writer.as_default():\n",
" tf.summary.text(\"Run Params\", pretty_json(some_obj_worth_noting), step=0)\n",
" tf.summary.text(\"Markdown jubiliee\", markdown_text, step=0)\n",
" tf.summary.text(\"run_params\", pretty_json(some_obj_worth_noting), step=0)\n",
" tf.summary.text(\"markdown_jubiliee\", markdown_text, step=0)\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 10,
"metadata": {
"id": "57082d8d6839"
},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"Reusing TensorBoard on port 6008 (pid 44187), started 1 day, 16:31:45 ago. (Use '!kill 44187' to kill it.)"
Contributor

I'd probably omit this cached output as well?

Collaborator Author

Dropped all of them.

]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" <iframe id=\"tensorboard-frame-d949964a890a033\" width=\"100%\" height=\"800\" frameborder=\"0\">\n",
" </iframe>\n",
" <script>\n",
" (function() {\n",
" const frame = document.getElementById(\"tensorboard-frame-d949964a890a033\");\n",
" const url = new URL(\"/\", window.location);\n",
" const port = 6008;\n",
" if (port) {\n",
" url.port = port;\n",
" }\n",
" frame.src = url;\n",
" })();\n",
" </script>\n",
" "
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%tensorboard --logdir logs/markdown"
]