Research Agenda

Author: Nishant Aswani
Affiliation: NYU Abu Dhabi, NYU Tandon
Published: September 24, 2025

Can we represent the nonlinear layers in neural networks as linear operations?

Progress
  • We developed layer scaling as a mechanism to “stretch” the steps of a dynamical system into smaller increments.

  • Taking advantage of layer scaling, we used time-delay embedding as the observable fed into a dynamic mode decomposition (DMD) procedure.
  • We individually replaced layers in two classifier networks (MNIST and YinYang) with varying success in preserving accuracy.
  • In the case of the YinYang network, we could visualize how the DMD replacements affected the final decision boundaries of the network.
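The delay-embedding-plus-DMD pipeline can be sketched as follows. This is a minimal illustration on synthetic data, not the actual experiment: the trajectory, embedding depth `q`, and truncation rank `r` are placeholder choices.

```python
# Hedged sketch: delay-embed a hidden-state trajectory, then fit a
# rank-truncated exact DMD operator to the resulting snapshot pairs.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a "layer-scaled" trajectory: T small steps of a d-dim state.
T, d = 50, 8
traj = np.cumsum(rng.normal(scale=0.1, size=(T, d)), axis=0)

def delay_embed(x, q):
    """Stack q consecutive states into one observable (Hankel embedding)."""
    n = x.shape[0]
    return np.hstack([x[i : n - q + 1 + i] for i in range(q)])

q = 3
Z = delay_embed(traj, q)            # (T - q + 1, q * d) embedded observables
X, Y = Z[:-1].T, Z[1:].T            # snapshot pairs; columns are states

# Exact DMD with rank-r truncation: A_tilde = U* Y V S^{-1}.
r = 10
U, s, Vh = np.linalg.svd(X, full_matrices=False)
U, s, V = U[:, :r], s[:r], Vh[:r].conj().T
A_tilde = (U.conj().T @ Y @ V) * (1.0 / s)   # scale columns by S^{-1}

eigvals, W = np.linalg.eig(A_tilde)          # DMD eigenvalues
modes = Y @ V @ np.diag(1.0 / s) @ W         # exact DMD modes
```

Replacing a layer then amounts to advancing the embedded state with the learned linear operator rather than the original nonlinear map.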

Current Challenges
Open Questions
  • How do we produce a mechanistic understanding of the layer? What does this layer “do”?
  • What does the varying success rate in hybridization across layers of a network say about the role of each layer? Why are some layers easier to replace than others?
  • Can we use this method to “edit” trained models?
  • What are the roadblocks in extending to convolution and attention?

Can we improve the interpretability of sparse autoencoders with a sparse Koopman variant?

Progress
  • We implemented a Koopman sparse autoencoder with the three losses (reconstruction, linear prediction, dynamic prediction) defined in Lusch, Kutz, and Brunton (2018) and trained it on the activations of an MLP layer from GPT-2 with \(n = 32 \times 768\).
  • Initial experiments are documented here.
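The three losses can be illustrated with a forward-only sketch on a pair of activations. The toy dimensions, the ReLU encoder, and the plain linear decoder are my assumptions for illustration, not the actual architecture or training loop.

```python
# Hedged sketch of the three KSAE losses on paired activations (x_t, x_next).
import numpy as np

rng = np.random.default_rng(1)
d, n = 16, 64                      # toy input/latent dims (not the real 768 / 32*768)

W_enc = rng.normal(scale=0.1, size=(n, d))
W_dec = rng.normal(scale=0.1, size=(d, n))
K = np.eye(n) + 0.01 * rng.normal(size=(n, n))   # latent Koopman operator

def enc(x):
    return np.maximum(W_enc @ x, 0.0)            # ReLU encoder -> sparse code

def dec(z):
    return W_dec @ z                             # linear decoder

x_t = rng.normal(size=d)                         # activation at "time" t
x_next = rng.normal(size=d)                      # activation one step later

z_t, z_next = enc(x_t), enc(x_next)

loss_recon = np.mean((dec(z_t) - x_t) ** 2)           # reconstruction
loss_linear = np.mean((K @ z_t - z_next) ** 2)        # linear prediction (latent space)
loss_dynamic = np.mean((dec(K @ z_t) - x_next) ** 2)  # dynamic prediction (input space)
```

In training, the three terms would be weighted and summed into a single objective; here they are only evaluated once to show where each comparison happens.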
Current Challenges
  • Koopman autoencoders typically learn a matrix \(\mathcal{K} \in \mathbb{R}^{n\times n}\), which is very costly when the latent dimension \(n\) is large. Learning the components of \(\mathcal{K}\) via DMD is not immediately feasible, because that would require computing a decomposition for each batch of data, which is even more costly.
  • Relative to a vanilla SAE, the number of “alive” dictionary components is very low for a trained KSAE. To remedy this, we should run sweeps and implement the feature density histogram metric.
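For concreteness, at the latent width from the GPT-2 experiments the dense operator alone is already large. This is a back-of-the-envelope parameter count; the fp32 storage assumption is mine.

```python
# Back-of-the-envelope cost of a dense Koopman operator at n = 32 * 768.
n = 32 * 768                  # latent dimension from the GPT-2 experiments
params = n * n                # entries of K in R^{n x n}
bytes_fp32 = params * 4       # assuming 32-bit floats

print(params)                 # 603979776 entries
print(bytes_fp32 / 2**30)     # 2.25 GiB for K alone
```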
Open Questions
  • Are we able to see how the dictionary components are “stepped” forward by the MLP in a transformer block?

By analyzing the dynamics of a single vehicle, can we discover the dynamics of other vehicles?

Progress
  • TBD.
Current Challenges
  • TBD.
Open Questions
  • TBD.

Citation

BibTeX citation:
@online{aswani2025,
  author = {Aswani, Nishant},
  title = {Research {Agenda}},
  date = {2025-09-24},
  url = {https://nishantaswani.com/articles/agenda.html},
  langid = {en}
}
For attribution, please cite this work as:
Aswani, Nishant. 2025. “Research Agenda.” September 24, 2025. https://nishantaswani.com/articles/agenda.html.