# (2016. 10) ByteNet

* Submitted on 2016. 10
* Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aäron van den Oord, Alex Graves and Koray Kavukcuoglu

## Simple Summary

> The ByteNet is a one-dimensional convolutional neural network that is composed of two parts, one to encode the source sequence and the other to decode the target sequence. The two network parts are connected by stacking the decoder on top of the encoder and preserving the temporal resolution of the sequences. To address the differing lengths of the source and the target, we introduce an efficient mechanism by which the decoder is dynamically unfolded over the representation of the encoder. The ByteNet uses dilation in the convolutional layers to increase its receptive field.

* The drawbacks of Seq2Seq (RNN) models grow more severe as the length of the sequences increases.
* Machine Translation Desiderata
  1. The running time of the network should be linear in the length of the source and target strings.
  2. The size of the source representation should be linear in the length of the source string, i.e. it should be resolution preserving, not constant size.
  3. The path traversed by forward and backward signals in the network (between input and output tokens) should be short.
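Desideratum 3 is what the dilated architecture buys: in an RNN the signal path between a source token and a target token is linear in the sequence length, while a stack of convolutions whose dilation doubles at each layer covers a length-`n` sequence in roughly `log n` layers. A minimal sketch of that calculation (`min_depth` is an illustrative helper, not from the paper):

```python
def min_depth(n, kernel_size=3):
    """Number of dilated conv layers (dilations 1, 2, 4, ...) needed
    for the receptive field to cover a sequence of length n."""
    depth, field, dilation = 0, 1, 1
    while field < n:
        # Each layer of kernel size k and dilation d widens the
        # receptive field by (k - 1) * d.
        field += (kernel_size - 1) * dilation
        dilation *= 2
        depth += 1
    return depth

print(min_depth(1024))  # 10 layers suffice for 1024 tokens
```

So the input-to-output path length grows logarithmically with sequence length, rather than linearly as in a recurrent encoder.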

![images](https://1712266326-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LMrEcS7cR9bGTHSnCnB%2F-LRazIqPSIKca1ujTQk5%2F-LRazK9bJqZtCdb6Z4M7%2Fbytenet_1.png?generation=1542547402795852\&alt=media)

* ByteNet
  * Encoder-Decoder Stacking: to maximize the representational bandwidth between the encoder and the decoder
  * Dynamic Unfolding: generates variable-length outputs (maintaining high bandwidth and being resolution-preserving)
  * Input Embedding Tensor
  * Masked One-dimensional Convolutions: The masking ensures that information from future tokens does not affect the prediction of the current token.
  * Dilation: Dilation makes the receptive field grow exponentially with the depth of the network, rather than linearly.
  * Residual Blocks
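The masking and dilation components can be sketched together in a few lines of NumPy. This is an illustrative single-channel sketch, not the paper's implementation (the real model convolves over embedding channels inside residual blocks); `masked_dilated_conv1d` and `receptive_field` are hypothetical helper names:

```python
import numpy as np

def masked_dilated_conv1d(x, w, dilation):
    """Causal (masked) 1-D convolution with dilation.

    x: (time,) input sequence; w: (kernel_size,) filter taps.
    The output at step t only sees x[t], x[t-d], x[t-2d], ...,
    so no information from future tokens leaks into the prediction.
    """
    k = len(w)
    pad = dilation * (k - 1)
    # Left-pad with zeros so the filter never reaches past step t.
    xp = np.concatenate([np.zeros(pad), x])
    out = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        # Taps spaced `dilation` steps apart, ending at the current token.
        out[t] = xp[t : t + pad + 1 : dilation] @ w
    return out

def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Doubling dilations 1, 2, 4, 8, 16 with kernel size 3:
print(receptive_field(3, [2 ** i for i in range(5)]))  # 63
```

With dilations doubling per layer, five layers already see 63 tokens; the receptive field grows exponentially with depth, which is the property the Dilation bullet refers to.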

![images](https://1712266326-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LMrEcS7cR9bGTHSnCnB%2F-LRazIqPSIKca1ujTQk5%2F-LRazK9dMtI5VkQjxdea%2Fbytenet_2.png?generation=1542547406958844\&alt=media)

* The ByteNet also achieves state-of-the-art performance on character-to-character machine translation on the English-to-German WMT translation task, surpassing comparable neural translation models that are based on recurrent networks with attentional pooling and run in quadratic time.
* The architecture builds on WaveNet (dilated convolutions) and PixelCNN (masked convolutions).

