Hello, my name is Norman.


I am a PhD student in Canada. My research focuses on reinforcement learning. The primary objectives of my work are to improve agent performance, data efficiency, and task transferability in complex environments. My research applies to financial markets (algorithmic trading), control systems, and robotics.

In my spare time, I enjoy creating and maintaining profitable trading algorithms. Currently, I am applying reinforcement learning and statistical models to cryptocurrency markets.

Research Papers:

Noisy Importance Sampling Actor-Critic: By injecting noise into the importance sampling ratio used to weight the training samples of an off-policy algorithm, we see improved performance across several Atari tasks. Further, we show that the noise fundamentally changes how off-policy samples are weighted.

Dynamic Planning Networks: A model that learns to use a state-transition model to dynamically construct plans by optimizing reward and state novelty. This work provides evidence that it is indeed better to learn how to plan in an end-to-end manner.


I worked full-time at Scaled Inference [1] in Palo Alto, CA. My work focused on distributed systems and machine learning.

I am the author of PLE, a reinforcement learning environment for Python.

Interned at Scaled Inference [1] in Palo Alto, CA. My internship focused on combining Bayesian models with deep neural networks. Additionally, I spent time improving modeling speed by porting code to run on GPUs.

Interned at Flipboard, where I created a method for image super-resolution. While there, I optimized model parameters using Bayesian optimization techniques over clusters of GPUs.

Performed research at the University of Western Ontario on anomaly detection in electrical stream data using machine learning methods.


@normantasfi or email (n plus tasfi at google email)

1: ceased operations in 2019.