updated assets and wording for notebook 3

bentrevett 2019-03-10 15:45:40 +00:00
parent 14edc64ff8
commit 977613215e
9 changed files with 21 additions and 13 deletions

@ -50,7 +50,7 @@
{
"data": {
"text/plain": [
"['This', 'film', 'is', 'terrible', 'This film', 'film is', 'is terrible']"
"['This', 'film', 'is', 'terrible', 'film is', 'This film', 'is terrible']"
]
},
"execution_count": 2,
@ -66,7 +66,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"TorchText `Field`s have a `preprocessing` argument. A function passed here will be applied to a sentence after it has been tokenized (transformed from a string into a list of tokens), but before it has been indexed (transformed from a list of tokens to a list of indexes). This is where we'll pass our `generate_bigrams` function."
"TorchText `Field`s have a `preprocessing` argument. A function passed here will be applied to a sentence after it has been tokenized (transformed from a string into a list of tokens), but before it has been numericalized (transformed from a list of tokens to a list of indexes). This is where we'll pass our `generate_bigrams` function."
]
},
{
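
The hunk above describes passing `generate_bigrams` as the `preprocessing` function, so that it runs on the token list after tokenization and before numericalization. The function body is not part of this diff, so the following is only a minimal sketch of what such a function could look like. It assumes the bigrams are collected in a set, which would be consistent with the bigram order changing between the two recorded outputs; the parameter name `tokens` is also an assumption.

```python
def generate_bigrams(tokens):
    """Return the tokens followed by every distinct bigram in the sentence.

    Assumed implementation for illustration; the notebook's real function
    is not shown in this diff.
    """
    # Pair each token with its successor: (t0, t1), (t1, t2), ...
    # A set drops duplicate bigrams, but its iteration order is arbitrary.
    bigrams = set(zip(tokens, tokens[1:]))
    # Join each pair into a single space-separated token and append it.
    return tokens + [' '.join(pair) for pair in bigrams]


result = generate_bigrams(['This', 'film', 'is', 'terrible'])
print(result)
```

The unigrams always come first and in their original order; only the order of the appended bigrams may differ between runs.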
@ -157,19 +157,23 @@
"\n",
"This model has far fewer parameters than the previous model as it only has 2 layers that have any parameters, the embedding layer and the linear layer. There is no RNN component in sight!\n",
"\n",
"Instead, it first calculates the word embedding for each word using the `Embedding` layer (purple), then calculates the average of all of the word embeddings and feeds this through the `Linear` layer (silver), and that's it!\n",
"Instead, it first calculates the word embedding for each word using the `Embedding` layer (blue), then calculates the average of all of the word embeddings (pink) and feeds this through the `Linear` layer (silver), and that's it!\n",
"\n",
"![](https://i.imgur.com/e0sWZoZ.png)\n",
"![](assets/sentiment8.png)\n",
"\n",
"We implement the averaging with the `avg_pool2d` (average pool 2-dimensions) function. Initially, you may think using a 2-dimensional pooling seems strange, surely our sentences are 1-dimensional, not 2-dimensional? However, you can think of the word embeddings as a 2-dimensional grid, where the words are along one axis and the dimensions of the word embeddings are along the other. The image below is an example sentence after being converted into 5-dimensional word embeddings, with the words along the vertical axis and the embeddings along the horizontal axis. Each element in this [4x5] tensor is represented by a green block.\n",
"\n",
"![](https://i.imgur.com/SSH25NT.png)\n",
"![](assets/sentiment9.png)\n",
"\n",
"The `avg_pool2d` uses a filter of size `embedded.shape[1]` (i.e. the length of the sentence) by 1. This is shown in pink in the image below.\n",
"\n",
"![](https://i.imgur.com/U7eRnIe.png)\n",
"![](assets/sentiment10.png)\n",
"\n",
"The filter then slides to the right, calculating the average of the next column of embedding values for each word in the sentence. After the filter has covered all embedding dimensions, we get a [1x5] tensor. This tensor is then passed through the linear layer to produce our prediction."
"We calculate the average value of all elements covered by the filter, then the filter slides to the right, calculating the average over the next column of embedding values for each word in the sentence.\n",
"\n",
"![](assets/sentiment11.png)\n",
"\n",
"Each filter position gives us a single value, the average of all covered elements. After the filter has covered all embedding dimensions we get a [1x5] tensor. This tensor is then passed through the linear layer to produce our prediction."
]
},
{
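
The markdown cell above explains that a filter of size `embedded.shape[1]` by 1 averages one embedding column at a time, turning a [4x5] grid into a [1x5] tensor. The arithmetic can be sketched in plain Python as below. This is not the notebook's PyTorch code: the helper name `average_over_words` and the example values are made up for illustration, and nested lists stand in for tensors.

```python
def average_over_words(embedded):
    """Average each embedding dimension over all words in one sentence.

    `embedded` is a [sent len x emb dim] grid given as a list of rows.
    Hypothetical helper that mimics a (sent len, 1) average-pooling filter.
    """
    sent_len = len(embedded)
    emb_dim = len(embedded[0])
    pooled = []
    # Each filter position covers one full column (one embedding dimension).
    for col in range(emb_dim):
        column_sum = sum(embedded[row][col] for row in range(sent_len))
        pooled.append(column_sum / sent_len)
    return pooled


# A made-up 4-word sentence with 5-dimensional embeddings.
embedded = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [3.0, 2.0, 1.0, 0.0, 5.0],
    [5.0, 2.0, 2.0, 4.0, 5.0],
    [3.0, 2.0, 2.0, 0.0, 5.0],
]
pooled = average_over_words(embedded)
print(pooled)  # [3.0, 2.0, 2.0, 2.0, 5.0]
```

Each of the five output values corresponds to one filter position, which is why the result has one entry per embedding dimension regardless of sentence length.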
@ -188,11 +192,11 @@
" self.embedding = nn.Embedding(vocab_size, embedding_dim)\n",
" self.fc = nn.Linear(embedding_dim, output_dim)\n",
" \n",
" def forward(self, x):\n",
" def forward(self, text):\n",
" \n",
" #x = [sent len, batch size]\n",
" #text = [sent len, batch size]\n",
" \n",
" embedded = self.embedding(x)\n",
" embedded = self.embedding(text)\n",
" \n",
" #embedded = [sent len, batch size, emb dim]\n",
" \n",
@ -518,7 +522,7 @@
{
"data": {
"text/plain": [
"2.414095661151805e-07"
"2.414077187040675e-07"
]
},
"execution_count": 18,
@ -563,7 +567,7 @@
"source": [
"## Next Steps\n",
"\n",
"In the final notebook we'll use convolutional neural networks (CNNs) to perform sentiment analysis, and get our best accuracy yet!"
"In the next notebook we'll use convolutional neural networks (CNNs) to perform sentiment analysis, and get our best accuracy yet!"
]
}
],
@ -583,7 +587,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
"version": "3.7.0"
}
},
"nbformat": 4,

BIN
assets/sentiment10.png Normal file

Binary file not shown.

Size: 7.0 KiB

1
assets/sentiment10.xml Normal file

@ -0,0 +1 @@
<mxfile modified="2019-03-10T15:32:39.941Z" host="www.draw.io" agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Safari/537.36" etag="3283P7J3xhpbVsTwkbdB" version="10.4.1" type="device"><diagram id="SksJatzgLejQSg5PRCuB" name="Page-1">7VtbT9swFP41fRxK4tZJHkcpDG1Mk5g29mgSN/HmxJXr3vbr5zTODQMCJjiWSF6a8/l+vuPv1FY7QfNifyHJKr8SKeWTwEv3E3Q2CQLf97D+qJBDjWAvrIFMstRU6oBr9pca0DPohqV0PaiohOCKrYZgIsqSJmqAESnFblhtKfhw1BXJqAVcJ4Tb6E+WqrxGoyDs8E+UZXkzso/juqQgTWWzknVOUrHrQWgxQXMphKrfiv2c8sp5jV/qducPlLYTk7RUT2lwMf2exuf+j8vV9jM/XG7miyv8IZqayalDs2KaagcYU0iVi0yUhC869DTZyC2tevW1IcWmTI+Wp62uwRchVqbKb6rUwVBLNkpoKFcFN6V0z9RN7/1X1dXJzFhne9Pz0Tg0Rqnk4aZv9FpVZtfsaDXtlqJU56RgvALmYiMZlXr1X+nOFJpZBlXl2jOVOx70eBORRGZUPeJmHLaE651CRUH1rHRDSTlRbDscgJiQzdp6bdNvgumhA89sryg+iVD3TKO6A7PV4vgE955wOux+rZefUNNjFzf6pTfFDjpG0zMiC6F6uC3hG7O+y0mAuXbT6a12Os5U6+Ne9Cm6V8P4WCsp/tC54EJqpBRlFYJLxvkdiHCWldpMNDuaVXS6pVIxvZM/moKCpekxfnc5U/R6RZJqzJ3WLSuMnxonftzYZgVmr1dD0/3jYWNHg2kw9QY0RsbcdfLT1Mh7ytNg90XPgNvnEhnNRol4A4kIHwiKV5UIH4FqBJ5aGpETRUeZeIFM+AG0TuBRJ95CJ2YQOhHC6sTM0gmVs/WoEy/RCQytE+GoE2+hEzGATgQ+rE5gSyf0zi5GnXiBTgQesE7g0CLTom3ovvsc3GP0WTvRorVP/SRAy2WiHytOdAnCKEZpG2T/xYmP3DoL4shlTpKU3ka3r85J6Bgn8chJe13sCCdNiLxrTmaOceK7zAlMPgG/NAgDl0mBSSjwpNhX9++OlLsZBZ4U+670/ZEyc40U+2LKIVKAcgr0BVNo3wI4RApQTgEnxenTPFBOASdlPM7bOQWcFKfP8zA5Bfwysjm8ukkKTE6BJ8XpEz1MToEnZTzRWzkFnpQx0QdB7NZ9ZOR0nofhBPxAH4953iYF+htxPOZ5i5RXTCna7P6VUP80o/tvB1r8Aw==</diagram></mxfile>

BIN
assets/sentiment11.png Normal file

Binary file not shown.

Size: 6.6 KiB

1
assets/sentiment11.xml Normal file

@ -0,0 +1 @@
<mxfile modified="2019-03-10T15:33:21.603Z" host="www.draw.io" agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Safari/537.36" etag="V5KUOpk1s79RdzXgweLR" version="10.4.1" type="device"><diagram id="SksJatzgLejQSg5PRCuB" name="Page-1">7Vtdb5swFP01PK4CnBh4XNO0q7ZOkzpt3aMDDngzOHJMQvbrZ4L5qtuq7dTYUuEl3OPve67Pja3EAYu8uuJok92wBFPHd5PKAReO73ueC+VHjRwaBLpBA6ScJKpSD9ySv1iBrkJLkuDtqKJgjAqyGYMxKwocixGGOGf7cbU1o+NRNyjFGnAbI6qjP0kisgYN/aDHP2GSZu3IHoyakhy1ldVKthlK2H4AgaUDFpwx0bzl1QLT2nmtX5p2l4+UdhPjuBDPaXA1+55El96P683uMz1cl4vlDfwQztTkxKFdMU6kA5TJuMhYygpElz16Hpd8h+tePWlwVhbJ0XKl1Tf4wthGVfmNhTgoalEpmIQykVNViisi7gbvv+quzubKuqhUz0fj0BqF4Ie7oTFoVZt9s6PVtluzQlyinNAaWLCSE8zl6r/ivSpUs/Tryo1nanc86vE2IhFPsXjCzTDoCJc7BbMcy1nJhhxTJMhuPABSIZt29bqm3xiRQ/uu2l5hdBaC/pmFTQdqq0XRGRw8wWzc/VYuP8aqxz5u5Mtgij10jKYXRBYAzXA7REu1vmvHh1S66XwlnQ5T0fl4EH0CV2IcH1vB2R+8YJRxiRSsqENwTSi9ByFK0kKasWRHsgrOd5gLInfyR1WQkyQ5xu8+IwLfblBcj7mXuqWF8XPjxItaW61A7fV6aFw9HTZ6NKgGM3dEY6jMfS8/bY1soDwt9lD0jLh9KZHhfJKIE0hE8EhQvKlEeMCoRsCZphEZEniSiVfIhOeb1gk46cQpdGJuQicCszox13RCZGQ76cRrdAKa1olg0olT6ERkQCd8z6xOQE0n5M7OJ514hU74rmGdgIFGpkbb2H0POXjA6It2okbrkHrHB3GCV+FKixNZAiCIQNIF2X9x4gG7zoIwtJmT9TqWz5tzEljGSWQzJ6fZJ911sSWctCHyrjmZW8aJN3FyP58YvzQIfJtJMZNQzJOiX91bRIqZjGKeFP2u9P2RMreNFP1i6t2RouUU0xdMgX4LYBEphnKKcVKm07yeU4yTYvVx3lBOMU7KdJ7Xcorxy8j28GonKWZyinlSphO9llPMk2L1id5MTjFPypTofT+y6z4ynPK8xonxA31kdZ43RIrpb8TRlFI0Ut4wpUiz/1dC89OM/r8dYPkP</diagram></mxfile>

BIN
assets/sentiment8.png Normal file

Binary file not shown.

Size: 12 KiB

1
assets/sentiment8.xml Normal file

@ -0,0 +1 @@
<mxfile modified="2019-03-10T14:54:03.829Z" host="www.draw.io" agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Safari/537.36" etag="ZEvUI-DQWfaSEmThm-qh" version="10.4.1" type="device"><diagram id="SksJatzgLejQSg5PRCuB" name="Page-1">7ZpLc5swEIB/jY/NAAKBj43jPKZNpzPptM1RAQFqBXJl4Ud/fSUjDBgnIY0JdgZyCLvSCrG738oIRmCSrK44msW3LMB0ZBnBagQuRpZlmgaU/5RmnWug4eaKiJNAdyoVd+Qv1kpDazMS4Hmto2CMCjKrK32WptgXNR3inC3r3UJG61edoQg3FHc+ok3tDxKIONd6llvqrzGJ4uLKJhznLQkqOus7mccoYMuKCkxHYMIZE/lZsppgqpxX+CW3u3ykdTsxjlPRxuDK/haML83vN7PFJ7q+ySbTW/jBKSYn1sUd40A6QIuMi5hFLEV0WmrPOcvSAKthDSmVfT4zNpNKUyp/YSHWOpooE0yqYpFQ3YpXRPxU5meOlu71YOr8YlUV1oWQCr6uGCnxvhhPCaXZRirsQpaKS5QQqhQTlnGCubzhL3ipG/UsLdU5d4bywKNO1qq5HMnHT3gWAJ2tiEdYPNHRhttkkBRhlmA5fWnIMUWCLOozQTqdo22/MuLyRAf9BQlQTHOBaKYvdTOyIJUzPn+QjoKR2PqlkiQCr0Q9pnPB2W88YZRxqUlZqjIlJJTuqBAlUSpFX3pURgKcLzAXRAL3UTckJAg2abaMicB3M7Rx81KWl0bqtY2tOS5kfQc669Wl8erpUDcDow1sPYgubABqeVmWiaJLXKkQwOgoko7ZiNKA8kFQtluj7PaJst1EOUYCDzS3odm0jg1na8C5I5zttjh7veJsN3AWMZkPOLfB2TKODWcw4NwRzk5bnMe94uw0cJYAJgPOrXD2jDPbsKBnOxC4jmcfGdywEbT3i+7zZcvP+GJz6+aBSYctSXecXkmHDdKfSY999FVS4UXhaTBfrQsjCwQIe6HfKCKyBfoefgi3ITvkw7HktxWhdmeEusPyW0mSzgh1T4NQdyB094G3f0S9AdG3QNQ7DUS9AdHdh9j+ER2fCqLHgdprHzi16VdG5KXLyu3tpIWxE+8cbW21E/LtNF6B5nhAUz2QOscFJ/yf9z+11H+/i+kBoXbabjdBY38Gvc366TS3m44I0jD05bEPUgDBGAQH+okL64XSbLlP1B2izRicAqJ9ofZqgh5dP89M17JN4I6BaYB6Hd8NfseLaZGSR8qpo/72LqabQw9Z5XdzdMNv7yts8wV7sX2vvCBbQqQTGv7J1Hdd59eYLrDaci9VxT6/UdjKueTmxf7/e38jYHZR2zt7BSDF8iO9HPvyU0cw/Qc=</diagram></mxfile>

BIN
assets/sentiment9.png Normal file

Binary file not shown.

Size: 6.9 KiB

1
assets/sentiment9.xml Normal file

@ -0,0 +1 @@
<mxfile modified="2019-03-10T15:31:47.753Z" host="www.draw.io" agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Safari/537.36" etag="AzVPMclZN3lZbafoIGNK" version="10.4.1" type="device"><diagram id="SksJatzgLejQSg5PRCuB" name="Page-1">7Vtdb5swFP01PK4CnBh4XNO0q7ZOkzpt3aMDLngzOHJMQvbrZ4L5ituq7dQaKc5LuMff91yfG1vBAYu8uuJond2wBFPHd5PKAReO73ueC+VXjewbBLpBA6ScJKpSD9ySv1iBrkJLkuDNqKJgjAqyHoMxKwocixGGOGe7cbV7RsejrlGKNeA2RlRHf5JEZA0a+kGPf8IkzdqRPRg1JTlqK6uVbDKUsN0AAksHLDhjonnKqwWmtfNavzTtLh8p7SbGcSGe0+Bq9j2JLr0f1+vtZ7q/LhfLG/ghnKnJiX27YpxIByiTcZGxlBWILnv0PC75Fte9etLgrCySg+VKq2/whbG1qvIbC7FX1KJSMAllIqeqFFdE3A2ef9Vdnc2VdVGpng/GvjUKwfd3Q2PQqjb7ZgerbXfPCnGJckJrYMFKTjCXq/+Kd6pQzdKvKzeeqd3xqMfbiEQ8xeIJN8OgI1zuFMxyLGclG3JMkSDb8QBIhWza1euafmNEDu27anuF0VkI+s8sbDpQWy2KzuDgE8zG3W/k8mOseuzjRj4MpthDh2h6QWQB0Ay3RbRU67t2fEilm85X0ukwFZ2PB9EncCXG8bERnP3BC0YZl0jBijoE7wmlRxCiJC2kGUt2JKvgfIu5IHInf1QFOUmSQ/zuMiLw7RrF9Zg7qVtaGD83TryotdUK1F6vh8bV02GjR4NqMHNHNIbK3PXy09bIBsrTYg9Fz4jblxIZzq1EvINEBI8ExZtKhAeMagScaRqRIYGtTLxCJjzftE5AqxPvoRNzEzoRmNWJuaYTIiMbqxOv0QloWicCqxPvoRORAZ3wPbM6ATWdkDs7tzrxCp3wXcM6AQONTI22sfsecvCA0RftRI3WIfWOD+IEr8KVFieyBEAQgaQLsv/ixAPTOgvC0HLiBRPjJLKcdNfFE+GkDZGT5mQ+MU48y8lxPjF+aRD4lpTjhGKeFP3q/uRIOc4o5knR70pPj5T51EjRL6ZOjhQtp5i+YAr0W4DTI+U4pxgnxZ7m9ZxinBR7nNdzinFS7HleyynGLyPbw+tJkxJMjRR7otdyinlS7IleyynmSbGJ3vejad1HhjbPa5wYP9BHNs/rpJj+RRzZPK+R8oYpRZr9WwnNXzP6dzvA8h8=</diagram></mxfile>