Norman Tasfi

Hello, my name is Norman.

Present:

I am an advisor to a robotics-focused startup, currently in stealth, that has grown immensely (from 0 to a valuation above 250M) and is critical to many Fortune 500 companies.

I am an angel investor, interested in early-stage deep tech startups. Please feel free to reach out to me² as an investor, an advisor, or both.

Research Papers:

Dynamic Successor Features for transfer learning and guided exploration: We propose a reformulation of the successor feature model such that supervised state-transition models can be used while enabling dynamic parameterization of the discount and policy.

Policy Agnostic Successor Features: We propose a series of adjustments to the successor feature framework that allow the use of a state-transition model to dynamically create the successor features in a policy-agnostic manner.

Second-Order Rewards For Successor Features: We introduce a novel formulation of the successor feature framework that models the reward as a non-linear combination of state features. The new formulation provides additional flexibility and improves performance when state features are imperfect. A new quantity emerges that can model the environment's stochasticity and can be used for guided exploration.

Noisy Importance Sampling Actor-Critic: By injecting noise into the importance sampling ratio, used to weight training samples of an on-policy algorithm, we see improved performance across several tasks in the Atari environment. Further, we show that the noise fundamentally changes how off-policy samples are weighted.

Dynamic Planning Networks: A model that learns to use a state-transition model to dynamically construct plans by optimizing reward and state novelty. This work provides evidence that it is indeed better to learn how to plan in an end-to-end manner.

Past:

I was a senior member of technical staff at a proprietary quantitative trading firm in Hong Kong. I touched everything from business deals to alpha generation; the only objective was generating outsized returns on the time and capital invested.

I was a quant researcher at a hedge fund in Manhattan, focused on high-frequency alpha on long-tail asset classes.

I held a Postdoctoral position at the University of Western Ontario, where I researched task-adaptive deep reinforcement learning algorithms. I advised Ph.D. candidates on RL projects through their graduation; their thesis work focused on making agents safer in unpredictable environments.

I completed my Ph.D. in Deep Reinforcement Learning (RL). My thesis, titled "Algorithmic Improvements In Deep Reinforcement Learning", presents several algorithms that improve RL methods across multiple performance axes.

I worked full-time at Scaled Inference¹ in Palo Alto, CA. My work focused on distributed systems and machine learning.

I am the author of PLE, a reinforcement learning environment for Python with over 50 academic citations and 900+ GitHub stars.

Interned at Scaled Inference¹ in Palo Alto, CA. My internship focused on combining Bayesian models with deep neural networks. Additionally, I spent time improving modeling speed by porting code to run on GPUs.

Interned at Flipboard, where I created a method for image super-resolution, for which I received a patent. While there, I also needed a way to optimize model parameters, which I did using Bayesian optimization techniques over clusters of GPUs.

Performed research at the University of Western Ontario focusing on anomaly detection with electrical stream data using machine learning methods.

Contact:

email (n plus tasfi at google email)

1: ceased operations in 2019.
2: Prepend emails with [ANGEL].