Hello, my name is Norman.

Present:

I am a MEV & quant researcher at a hedge fund in Manhattan.

I hold a postdoctoral position at the University of Western Ontario, where I research task-adaptive deep reinforcement learning algorithms. I am currently advising PhD candidates on their RL projects.

Research Papers:

Dynamic Successor Features For Transfer Learning And Guided Exploration: We propose a reformulation of the successor feature model so that supervised state-transition models can be used, while enabling dynamic parameterization of the discount factor and policy.

Policy-Agnostic Successor Features: We propose a series of adjustments to the successor feature framework that allow a state-transition model to dynamically construct the successor features in a policy-agnostic manner.

Second-Order Rewards For Successor Features: We introduce a novel formulation of the successor feature framework that models the reward as a non-linear combination of state features. The new formulation provides additional flexibility and improves performance when state features are imperfect. A new quantity emerges that can model environment stochasticity and can be used for guided exploration.
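The three papers above all build on the standard successor feature identity: if the reward is linear in state features, r(s) = φ(s)·w, then the value function factors as V(s) = ψ(s)·w, where ψ are the successor features. A minimal sketch of that baseline identity on a toy Markov reward process (the toy setup and all names here are illustrative, not from the papers):

```python
import numpy as np

# Toy Markov reward process: random row-stochastic transitions and
# random state features. Everything below is an illustrative sketch of
# the standard successor-feature factorization, not the papers' methods.
rng = np.random.default_rng(0)
n_states, n_feats, gamma = 5, 3, 0.9

P = rng.random((n_states, n_states))
P /= P.sum(axis=1, keepdims=True)      # transition matrix, rows sum to 1
Phi = rng.random((n_states, n_feats))  # state features phi(s)
w = rng.random(n_feats)                # reward weights, r = Phi @ w

# Successor features satisfy psi = Phi + gamma * P @ psi,
# so in closed form: Psi = (I - gamma * P)^(-1) Phi.
Psi = np.linalg.solve(np.eye(n_states) - gamma * P, Phi)

# Ordinary value function computed directly from the scalar reward.
V = np.linalg.solve(np.eye(n_states) - gamma * P, Phi @ w)

# The factorization V = Psi @ w holds exactly.
assert np.allclose(Psi @ w, V)
```

Because ψ depends only on the dynamics and features, not on w, the same ψ can be reused to evaluate new reward weights instantly; the papers above relax different assumptions baked into this baseline (fixed policy, fixed discount, linearity in φ).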

Noisy Importance Sampling Actor-Critic: By injecting noise into the importance sampling ratio used to weight training samples in an on-policy algorithm, we obtain improved performance across several Atari tasks. We further show that the noise fundamentally changes how off-policy samples are weighted.
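The core idea can be sketched in a few lines. The importance sampling ratio π(a|s)/μ(a|s) reweights samples drawn from a behaviour policy μ toward the target policy π; injecting noise perturbs those weights during training. The multiplicative log-normal form and the scale below are assumptions for illustration, not the paper's exact mechanism:

```python
import numpy as np

rng = np.random.default_rng(1)

def is_ratio(pi_prob, mu_prob, noise_scale=0.1):
    """Importance sampling ratio with multiplicative log-normal noise.

    noise_scale=0 recovers the standard ratio; the noise distribution
    and scale here are illustrative assumptions.
    """
    noise = np.exp(noise_scale * rng.standard_normal(np.shape(pi_prob)))
    return (pi_prob / mu_prob) * noise

pi_p = np.array([0.5, 0.2, 0.3])  # target-policy action probabilities
mu_p = np.array([0.4, 0.4, 0.2])  # behaviour-policy action probabilities

plain = pi_p / mu_p                          # standard IS weights
noisy = is_ratio(pi_p, mu_p, noise_scale=0.1)  # jittered IS weights
```

With the scale set to zero the noisy weights coincide exactly with the plain ratio, so the standard estimator is a special case.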

Dynamic Planning Networks: A model that learns to use a state-transition model to dynamically construct plans by optimizing for reward and state novelty. This work provides evidence that it is better to learn how to plan in an end-to-end manner.

Past:

I completed my Ph.D. in Deep Reinforcement Learning (RL). My thesis, titled "Algorithmic Improvements In Deep Reinforcement Learning", presents several algorithms that improve the performance of RL algorithms across multiple performance axes.

I worked full-time at Scaled Inference [1] in Palo Alto, CA, where I focused on distributed systems and machine learning.

I am the author of PLE (the PyGame Learning Environment), a reinforcement learning environment for Python with over 50 academic citations and 900+ GitHub stars.

I interned at Scaled Inference [1] in Palo Alto, CA, where I focused on combining Bayesian models with deep neural networks. I also spent time improving modeling speed by porting code to run on GPUs.

I interned at Flipboard, where I created a method for image super-resolution, for which I received a patent. While there, I built a system to optimize model parameters using Bayesian optimization techniques over clusters of GPUs.

I performed research at the University of Western Ontario, focusing on anomaly detection in streaming electrical data using machine learning methods.

Contact:

@normantasfi or email (n plus tasfi at google email)

[1] Ceased operations in 2019.