Questions tagged [sequence-to-sequence]

121 questions
17 votes, 3 answers

How to determine feature importance in a neural network?

I have a neural network to solve a time series forecasting problem. It is a sequence-to-sequence neural network and currently it is trained on samples each with ten features. The performance of the model is average and I would like to investigate…
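
A model-agnostic approach often suggested for this kind of question is permutation importance. A minimal NumPy sketch, assuming a fitted model with a predict() method and a validation set X_val of shape (samples, timesteps, features); all names here are illustrative, not from the question:

    # Hypothetical sketch: permutation importance for a sequence model.
    import numpy as np

    def permutation_importance(model, X_val, y_val, n_repeats=5, seed=0):
        rng = np.random.default_rng(seed)

        def mse(y_true, y_pred):
            return float(np.mean((y_true - y_pred) ** 2))

        baseline = mse(y_val, model.predict(X_val))
        scores = []
        for f in range(X_val.shape[2]):
            drops = []
            for _ in range(n_repeats):
                X_perm = X_val.copy()
                # Shuffle feature f across samples, keeping the time axis intact.
                X_perm[:, :, f] = X_perm[rng.permutation(len(X_perm)), :, f]
                drops.append(mse(y_val, model.predict(X_perm)) - baseline)
            scores.append(np.mean(drops))
        return np.array(scores)  # higher = removing the feature hurts more
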
15 votes, 1 answer

Why do we need to add START + END symbols when using Recurrent Neural Nets for Sequence-to-Sequence Models?

In sequence-to-sequence models, we often see that START (e.g. <s>) and END (e.g. </s>) symbols are added to the inputs and outputs before training the model and before inference/decoding on unseen data. E.g.…
alvas
  • 2,340
  • 6
  • 25
  • 38
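
For the START/END question above, a minimal illustration of the usual convention; the token strings <s> and </s> are a common choice, not something mandated by the question:

    # Illustrative only: wrapping target sequences with START/END tokens
    # before training a seq2seq model.
    START, END = "<s>", "</s>"

    def add_markers(tokens):
        return [START] + tokens + [END]

    tgt = "le chat etait assis sur le tapis".split()

    # The decoder is trained to predict the target shifted by one position:
    decoder_input  = add_markers(tgt)[:-1]   # <s> le chat ... tapis
    decoder_target = add_markers(tgt)[1:]    # le chat ... tapis </s>
    print(decoder_input)
    print(decoder_target)
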
11 votes, 2 answers

How do attention mechanisms in RNNs learn weights for a variable length input

Attention mechanisms are reasonably common in RNN-based sequence-to-sequence models. I understand that the decoder learns a weight vector $\alpha$ which is applied as a weighted sum of the output vectors from the encoder network. This is used to…
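
A small NumPy sketch of the weights $\alpha$ described in this excerpt, using additive (Bahdanau-style) scoring; all parameter names and sizes are illustrative:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    T, d = 5, 8                                 # source length, hidden size
    encoder_outputs = np.random.randn(T, d)     # h_1 ... h_T
    decoder_state = np.random.randn(d)          # s_{t-1}

    # Learned parameters, trained jointly with the rest of the network.
    W_enc = np.random.randn(d, d)
    W_dec = np.random.randn(d, d)
    v = np.random.randn(d)

    scores = np.array([v @ np.tanh(W_enc @ h + W_dec @ decoder_state)
                       for h in encoder_outputs])   # one score per source step
    alpha = softmax(scores)                         # attention weights, sum to 1
    context = alpha @ encoder_outputs               # weighted sum of encoder outputs

The key point is that the scores, and hence $\alpha$, are recomputed for whatever source length T the input happens to have; only W_enc, W_dec and v are fixed-size learned parameters.
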
9 votes, 2 answers

Input for LSTM for financial time series directional prediction

I'm working on using an LSTM to predict the direction of the market for the next day. My question concerns the input for the LSTM. My data is a financial time series $x_1 \ldots x_t$ where each $x_i$ represents a vector of features for day $i$, i.e…
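
One common way to shape such data for an LSTM is a sliding window over the feature series with a next-day direction label; a hedged sketch, with all function and variable names made up for illustration:

    # Turn a daily feature series into LSTM input of shape
    # (samples, timesteps, n_features) with an up/down label for the next day.
    import numpy as np

    def make_windows(X, prices, lookback=30):
        """X: (days, n_features) feature matrix; prices: (days,) closing prices."""
        xs, ys = [], []
        for t in range(lookback, len(X) - 1):
            xs.append(X[t - lookback:t])                      # last `lookback` days
            ys.append(1 if prices[t + 1] > prices[t] else 0)  # direction of next day
        return np.array(xs), np.array(ys)
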
7 votes, 1 answer

Minimal working example or tutorial showing how to use Pytorch's nn.TransformerDecoder for batch text generation in training and inference modes?

I want to solve a sequence-to-sequence text generation task (e.g. question answering, language translation, etc.). For the purposes of this question, you may assume that the input part is already handled. (I already have a tensor of…
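
Not a definitive recipe, but a minimal sketch of teacher-forced training and greedy decoding with torch.nn.TransformerDecoder; the encoder output memory, the BOS id and all sizes below are assumptions made purely for illustration:

    import torch
    import torch.nn as nn

    vocab, d_model, max_len = 1000, 128, 20
    embed = nn.Embedding(vocab, d_model)
    layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=8, batch_first=True)
    decoder = nn.TransformerDecoder(layer, num_layers=2)
    to_logits = nn.Linear(d_model, vocab)

    def causal_mask(sz):
        # Disallow attending to future target positions.
        return torch.triu(torch.full((sz, sz), float("-inf")), diagonal=1)

    batch, src_len = 4, 12
    memory = torch.randn(batch, src_len, d_model)      # encoder output (assumed given)
    tgt = torch.randint(0, vocab, (batch, max_len))    # gold target ids

    # Training: feed tgt[:, :-1], predict tgt[:, 1:], with a causal mask.
    inp, gold = tgt[:, :-1], tgt[:, 1:]
    out = decoder(embed(inp), memory, tgt_mask=causal_mask(inp.size(1)))
    loss = nn.functional.cross_entropy(to_logits(out).transpose(1, 2), gold)

    # Inference: grow the sequence one token at a time (greedy decoding).
    BOS = 1
    ys = torch.full((batch, 1), BOS, dtype=torch.long)
    for _ in range(max_len - 1):
        out = decoder(embed(ys), memory, tgt_mask=causal_mask(ys.size(1)))
        next_tok = to_logits(out[:, -1]).argmax(-1, keepdim=True)
        ys = torch.cat([ys, next_tok], dim=1)
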
6 votes, 3 answers

How are Q, K, and V Vectors Trained in a Transformer Self-Attention?

I am new to transformers, so this may be a silly question, but I was reading about transformers and how they use attention, and it involves the use of three special vectors. Most articles say that one will understand their purpose after reading…
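
The point most answers make is that Q, K and V are not stored vectors but the outputs of three learned linear projections, trained by ordinary backpropagation like any other weights; a small PyTorch sketch with arbitrary sizes:

    import torch
    import torch.nn as nn

    d_model, seq_len = 64, 10
    W_q = nn.Linear(d_model, d_model, bias=False)
    W_k = nn.Linear(d_model, d_model, bias=False)
    W_v = nn.Linear(d_model, d_model, bias=False)

    x = torch.randn(seq_len, d_model)          # token representations
    Q, K, V = W_q(x), W_k(x), W_v(x)           # projections of the same input
    attn = torch.softmax(Q @ K.T / d_model ** 0.5, dim=-1)
    out = attn @ V

    # Any loss on `out` sends gradients back into W_q, W_k and W_v.
    out.sum().backward()
    print(W_q.weight.grad is not None)  # True
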
5 votes, 1 answer

ValueError: Cannot convert a partially known TensorShape to a Tensor: (?, 256)

I'm working on a sequence to sequence approach using LSTM and a VAE with an attention mechanism. p = np.random.permutation(len(input_data)) input_data = input_data[p] teacher_data = teacher_data[p] target_data = target_data[p] BUFFER_SIZE =…
5 votes, 2 answers

Does this encoder-decoder LSTM make sense for time series sequence to sequence?

TASK: given $\vec x = [x_{t=-3}, x_{t=-2}, x_{t=-1}, x_{t=0}]$, predict $\vec y = [x_{t=1}, x_{t=2}]$ with an LSTM encoder-decoder (seq2seq). MODEL NOTE: the ? symbol in the shape of the tensors refers to batch_size, following TensorFlow…
ignatius
  • 1,638
  • 7
  • 21
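
A minimal tf.keras sketch of the "four steps in, two steps out" setup described in that question, using the common RepeatVector encoder-decoder pattern; layer sizes are arbitrary and not taken from the question:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    n_in, n_out, n_features, units = 4, 2, 1, 32

    model = models.Sequential([
        layers.Input(shape=(n_in, n_features)),
        layers.LSTM(units),                         # encoder: summary of the input window
        layers.RepeatVector(n_out),                 # repeat it once per output step
        layers.LSTM(units, return_sequences=True),  # decoder
        layers.TimeDistributed(layers.Dense(1)),    # one predicted value per output step
    ])
    model.compile(optimizer="adam", loss="mse")
    model.summary()
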
5 votes, 1 answer

How/What to initialize the hidden states in RNN sequence-to-sequence models?

In an RNN sequence-to-sequence model, the encoder's input hidden states and the decoder's output hidden states need to be initialized before training. What values should we initialize them with? How should we initialize them? From the PyTorch tutorial, it…
alvas
  • 2,340
  • 6
  • 25
  • 38
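
The two options usually discussed for the encoder are an all-zeros initial state (PyTorch's default when the initial hidden state is None) or a learnable initial state; the decoder is then typically seeded from the encoder's final state rather than re-initialized. A small sketch, with illustrative sizes:

    import torch
    import torch.nn as nn

    hidden, batch = 32, 4
    gru = nn.GRU(input_size=16, hidden_size=hidden, batch_first=True)
    x = torch.randn(batch, 10, 16)

    # Option 1: all-zeros initial state (what passing None does).
    out, h_n = gru(x, None)

    # Option 2: a learnable initial state, expanded over the batch.
    h0 = nn.Parameter(torch.zeros(1, 1, hidden))
    out, h_n = gru(x, h0.expand(1, batch, hidden).contiguous())

    # In a seq2seq model, h_n would then be handed to the decoder as its
    # initial hidden state.
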
4 votes, 2 answers

Sentences language translation with neural network, with a simple layer structure (if possible sequential)

Context: many sentence-translation systems (e.g. French to English) built with neural networks use a seq2seq structure: "the cat sat on the mat" -> [Seq2Seq model] -> "le chat etait assis sur le tapis" Example: A ten-minute introduction to…
4 votes, 1 answer

Answer to Question

Looking for a system which can generate questions from answers. Most systems and blog posts on the internet cover question-to-answer, but not answer-to-question, paraphrasing, or keyword-to-question. I tried Seq2Seq, and even after training for many…
Sandeep Bhutani
  • 884
  • 1
  • 7
  • 22
4 votes, 1 answer

SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors

I am writing an Encoder-Decoder architecture with Bahdanau Attention using tf.keras with TensorFlow 2.0. Below is my code. It works with TensorFlow 1.15 but raises this error in 2.0. You can check the code in a Colab notebook here. Can you please…
4 votes, 1 answer

Why do position embeddings work?

In the papers "Convolutional Sequence to Sequence Learning" and "Attention Is All You Need", position embeddings are simply added to the input word embeddings to give the model a sense of the order of the input sequence. These position embeddings…
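
A short NumPy sketch of the sinusoidal position embeddings from "Attention Is All You Need", added element-wise to the word embeddings; sizes are illustrative:

    import numpy as np

    def sinusoidal_positions(max_len, d_model):
        pos = np.arange(max_len)[:, None]                 # (max_len, 1)
        i = np.arange(d_model)[None, :]                   # (1, d_model)
        angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
        pe = np.zeros((max_len, d_model))
        pe[:, 0::2] = np.sin(angles[:, 0::2])             # even dims: sine
        pe[:, 1::2] = np.cos(angles[:, 1::2])             # odd dims: cosine
        return pe

    seq_len, d_model = 6, 16
    word_embeddings = np.random.randn(seq_len, d_model)
    x = word_embeddings + sinusoidal_positions(seq_len, d_model)  # simple addition
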
4 votes, 2 answers

Is this a problem for a Seq2Seq model?

I'm struggling to find a tutorial/example which covers using a seq2seq model for sequential inputs other than text/translation. I have a multivariate dataset with n input variables, each composed of sequences, and a single output sequence…
Ellio
  • 93
  • 1
  • 6
3 votes, 1 answer

Multi-output, multi-timestep sequence prediction with Keras

I've been searching for about three hours and I can't find an answer to a very simple question. I have a time series prediction problem. I am trying to use a Keras LSTM model (with a Dense at the end) to predict multiple outputs over multiple…
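
One common pattern for "multiple outputs over multiple timesteps with a Dense at the end" is to predict all step-target values at once and reshape; a hedged tf.keras sketch with made-up shapes, not the asker's actual configuration:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    n_in, n_features = 24, 5      # input window: 24 steps, 5 features
    n_out, n_targets = 6, 2       # predict 2 series, 6 steps ahead

    model = models.Sequential([
        layers.Input(shape=(n_in, n_features)),
        layers.LSTM(64),
        layers.Dense(n_out * n_targets),      # one unit per (step, target) pair
        layers.Reshape((n_out, n_targets)),   # -> (batch, n_out, n_targets)
    ])
    model.compile(optimizer="adam", loss="mse")
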