Commit 8009a0a

deleted files, improved python/numpy basics
1 parent 38ab45e commit 8009a0a

8 files changed (+3542, -158 lines changed)

01-Principles/01 Python Basics.ipynb

+57-4
@@ -110,7 +110,60 @@
110110
"source": [
111111
"The difference between the two is that using double quotes makes it easy to include apostrophes (whereas these would terminate the string if using single quotes). There are additional variations on defining strings that make it easier to include things such as carriage returns, backslashes and Unicode characters.\n",
112112
"\n",
113-
"We can use single operators on numbers and strings, such as concatenation:"
113+
"The dynamic typing means that the same variable name could be used for multiple different data types, for an example, look at this C code:\n",
114+
"\n",
115+
"```C\n",
116+
"/* C code */\n",
117+
"int result = 0;\n",
118+
"for (int i = 0; i < 100; i++)\n",
119+
"{\n",
120+
" result += i;\n",
121+
"}\n",
122+
"```\n",
123+
"\n",
124+
"While in Python the equivalent operation could be written as:\n",
125+
"\n",
126+
"```python\n",
127+
"# python code\n",
128+
"result = 0\n",
129+
"for i in range(100):\n",
130+
" result += i\n",
131+
"```\n",
132+
"\n",
133+
"Notice that the `result` variable is explicitly declared, whereas in Python it is not explicitly an integer. See what happens in C if `result` is set a string variable:\n",
134+
"\n",
135+
"```C\n",
136+
"/* C code */\n",
137+
"int x = 4;\n",
138+
"x = \"four\"; // FAILS\n",
139+
"```\n",
140+
"\n",
141+
"Whereas in Python:\n",
142+
"\n",
143+
"```python\n",
144+
"# python code\n",
145+
"x = 4\n",
146+
"x = \"four\" # overwritten\n",
147+
"```"
148+
]
149+
},
150+
{
151+
"cell_type": "markdown",
152+
"metadata": {},
153+
"source": [
154+
"In Python, all types such as integers and strings are not *primitives*, but also *objects*. In fact, the standard Python implementation is written in C, meaning that a Python `int` is actually a C `struct`:\n",
155+
"\n",
156+
"```C\n",
157+
"/* C code */\n",
158+
"struct _longobject {\n",
159+
" long ob_refcnt;\n",
160+
" PyTypeObject *ob_type;\n",
161+
" size_t ob_size;\n",
162+
" long ob_digit[1];\n",
163+
"};\n",
164+
"```\n",
165+
"\n",
166+
"The same is true for all Python types, and even objects, which we will come onto later. We can use single operators on numbers and strings, such as concatenation:"
114167
]
115168
},
116169
{
@@ -1358,7 +1411,7 @@
13581411
"source": [
13591412
"## Dictionaries\n",
13601413
"\n",
1361-
"A dictionary is a data type similar to arrays, but works with keys and values instead of indexes. Each value stored in a dictionary can be accessed using a key, which is any type of object (a string, a number, a list, etc.) instead of using its index to address it.\n",
1414+
"A dictionary (previously known as a *hash table*) is a data type similar to arrays, but works with keys and values instead of indexes. Each value stored in a dictionary can be accessed using a key, which is any type of object (a string, a number, a list, etc.) instead of using its index to address it.\n",
13621415
"\n",
13631416
"For example, a database of phone numbers could be stored using a dictionary like this:"
13641417
]
@@ -1704,7 +1757,7 @@
17041757
],
17051758
"metadata": {
17061759
"kernelspec": {
1707-
"display_name": "Python [default]",
1760+
"display_name": "Python 3",
17081761
"language": "python",
17091762
"name": "python3"
17101763
},
@@ -1718,7 +1771,7 @@
17181771
"name": "python",
17191772
"nbconvert_exporter": "python",
17201773
"pygments_lexer": "ipython3",
1721-
"version": "3.5.6"
1774+
"version": "3.6.7"
17221775
}
17231776
},
17241777
"nbformat": 4,

02-Simulation/01 NumPy Basics.ipynb

+81-26
@@ -11,14 +11,39 @@
1111
"NumPy arrays therefore **must** be the same datatype (float, int etc).\n",
1212
"\n",
1313
"The flow of this notebook is as follows:\n",
14-
"1. Creating an array\n",
15-
"2. Creating zeros, ones, linspace...\n",
14+
"1. Supported types\n",
15+
"1. Creating an array from scratch\n",
1616
"3. Generating random numbers\n",
1717
"4. Inspecting the array\n",
1818
"5. Arithmetic operations\n",
1919
"6. Aggregation\n",
2020
"7. Subsetting, slicing, indexing\n",
2121
"\n",
22+
"## Supported types\n",
23+
"\n",
24+
"NumPy arrays contain values of a *single type*, so it's important to know which types are available to use, in addition to the fact that NumPy is built in C. These can usually be specified in NumPy array-creating functions using the `dtype` parameter:\n",
25+
"\n",
26+
"| **Data Type** | **Description** |\n",
27+
"| ---------- | ----------------------------- |\n",
28+
"| `bool_` | Boolean (`True` or `False` stored as a byte |\n",
29+
"| `int_` | Default integer type (same as C `long`) |\n",
30+
"| `intc` | Identical to C `int` |\n",
31+
"| `intp` | Integer for indexing |\n",
32+
"| `int8` | A byte |\n",
33+
"| `int16` | Integer (-32768 to 32767) |\n",
34+
"| `int32` | Integer (-2147483648 to 2147483647) |\n",
35+
"| `int64` | Integer (-9223372036854775808 to 9223372036854775807) |\n",
36+
"| `uint8` | Unsigned Integer (0 to 255) |\n",
37+
"| `uint16` | Unsigned Integer (0 to 65535) |\n",
38+
"| `uint32` | Unsigned Integer (0 to 4294967295) |\n",
39+
"| `uint64` | Unsigned Integer (0 to 18446744073709551615) |\n",
40+
"| `float_` | Same as `float64` |\n",
41+
"| `float16` | Half precision float (1-bit sign, 5-bit exponent, 10-bit mantissa) |\n",
42+
"| `float32` | Single precision float (1-bit sign, 8-bit exponent, 23-bit mantissa) |\n",
43+
"| `float64` | Double precision float (1-bit sign, 11-bit exponent, 52-bit mantissa) |\n",
44+
"|`complex_` | Same as `complex128` |\n",
45+
"| `complex64` | Complex number, as two 32-bit floats |\n",
46+
"| `complex128` | Complex number, represented by two 64-bit floats |\n",
2247
"\n",
2348
"We use the following convention **np** for numpy import:"
2449
]
@@ -36,7 +61,7 @@
3661
"cell_type": "markdown",
3762
"metadata": {},
3863
"source": [
39-
"## Creating an array\n",
64+
"## Creating arrays from scratch\n",
4065
"\n",
4166
"We can initialize numpy arrays from nested Python lists, and access elements using square brackets:"
4267
]
@@ -61,7 +86,7 @@
6186
"outputs": [],
6287
"source": [
6388
"# 2-d ints\n",
64-
"b = np.array([[3.0, 2.0],[1.0, 2.0]], dtype=int)\n",
89+
"b = np.array([[3.0, 2.0],[1.0, 2.0]], dtype=np.int_)\n",
6590
"b"
6691
]
6792
},
@@ -87,8 +112,6 @@
87112
"cell_type": "markdown",
88113
"metadata": {},
89114
"source": [
90-
"## Creating zeros, ones, linspace, identity matrix...\n",
91-
"\n",
92115
"Numpy also provides many functions to create arrays from the same value or not, for example `zeros()` creates an array full of zeros, given a specific size (or tuple of dimensions!), and `linspace()` creates an incrementally-ordered vector of numbers between two given values, and given a size."
93116
]
94117
},
@@ -170,6 +193,23 @@
170193
"g"
171194
]
172195
},
196+
{
197+
"cell_type": "markdown",
198+
"metadata": {},
199+
"source": [
200+
"An even faster form of allocation, if you're going to fill the array/matrix yourself, is to allocate an *empty array*, meaning the values are uninitialized:"
201+
]
202+
},
203+
{
204+
"cell_type": "code",
205+
"execution_count": null,
206+
"metadata": {},
207+
"outputs": [],
208+
"source": [
209+
"c2 = np.empty((10,), dtype=np.float64)\n",
210+
"c2"
211+
]
212+
},
173213
{
174214
"cell_type": "markdown",
175215
"metadata": {},
@@ -227,7 +267,28 @@
227267
"cell_type": "markdown",
228268
"metadata": {},
229269
"source": [
230-
"## Data Types"
270+
"## Array attributes\n",
271+
"\n",
272+
"The three primary attributes a NumPy array has are:\n",
273+
"\n",
274+
"1. Number of dimensions (`ndim`)\n",
275+
"2. Array shape (`shape`)\n",
276+
"3. Total size of the array (`size`)\n",
277+
"\n",
278+
"If we begin by allocating a 1-D, 2-D and 3-D array:"
279+
]
280+
},
281+
{
282+
"cell_type": "code",
283+
"execution_count": null,
284+
"metadata": {},
285+
"outputs": [],
286+
"source": [
287+
"# for reproducibility\n",
288+
"np.random.seed(0)\n",
289+
"array1d = np.random.randint(10, size=6)\n",
290+
"array2d = np.random.randint(10, size=(3, 4))\n",
291+
"array3d = np.random.randint(10, size=(3,4,5))"
231292
]
232293
},
233294
{
@@ -236,19 +297,19 @@
236297
"metadata": {},
237298
"outputs": [],
238299
"source": [
239-
"print(np.int64)\n",
240-
"print(np.float64)\n",
241-
"print(np.bool)\n",
242-
"print(np.string_)"
300+
"def print_array(arr):\n",
301+
" print(\"array ndim: %d\" % arr.ndim)\n",
302+
" print(\"array shape: {}\".format(arr.shape))\n",
303+
" print(\"array size: %d\" % arr.size)"
243304
]
244305
},
245306
{
246-
"cell_type": "markdown",
307+
"cell_type": "code",
308+
"execution_count": null,
247309
"metadata": {},
310+
"outputs": [],
248311
"source": [
249-
"## Array shape\n",
250-
"\n",
251-
"Returns a *tuple* whereby the first number represents the **number of rows** (FORTRAN-style memory-mapping!), the second number represents the **number of columns** and so on into higher dimensions."
312+
"print_array(array1d)"
252313
]
253314
},
254315
{
@@ -257,9 +318,7 @@
257318
"metadata": {},
258319
"outputs": [],
259320
"source": [
260-
"print(a.shape)\n",
261-
"print(j.shape)\n",
262-
"print(c.shape)"
321+
"print_array(array2d)"
263322
]
264323
},
265324
{
@@ -268,8 +327,7 @@
268327
"metadata": {},
269328
"outputs": [],
270329
"source": [
271-
"# the number of dimensions!\n",
272-
"b.ndim"
330+
"print_array(array3d)"
273331
]
274332
},
275333
{
@@ -278,8 +336,7 @@
278336
"metadata": {},
279337
"outputs": [],
280338
"source": [
281-
"print(a.dtype)\n",
282-
"print(b.dtype)"
339+
"print(array3d.dtype)"
283340
]
284341
},
285342
{
@@ -1200,9 +1257,7 @@
12001257
},
12011258
{
12021259
"cell_type": "markdown",
1203-
"metadata": {
1204-
"collapsed": true
1205-
},
1260+
"metadata": {},
12061261
"source": [
12071262
"### Task 3.\n",
12081263
"\n",
@@ -1235,7 +1290,7 @@
12351290
"name": "python",
12361291
"nbconvert_exporter": "python",
12371292
"pygments_lexer": "ipython3",
1238-
"version": "3.6.6"
1293+
"version": "3.6.7"
12391294
}
12401295
},
12411296
"nbformat": 4,

05-Learning/02 Model and Feature Selection.ipynb

-32
This file was deleted.

05-Learning/03 Preprocessing.ipynb

-32
This file was deleted.

05-Learning/04 Unsupervised.ipynb

-32
This file was deleted.
