Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
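For reference, the setting in question is a single line in the site's _config.yml; a minimal excerpt might look like this (the comment is only illustrative):

    # _config.yml
    # Posts dated in the future are shown by default on this site; set this to
    # false to hide them until their publish date arrives.
    future: false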

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Publications

Selected reaction monitoring approach for validating peptide biomarkers

Proceedings of the National Academy of Sciences, 2017

We here describe a selected reaction monitoring (SRM)-based approach for the discovery and validation of peptide biomarkers for cancer. The first stage of this approach is the direct identification of candidate peptides through comparison of proteolytic peptides derived from the plasma of cancer patients or healthy individuals. Several hundred candidate peptides were identified through this method, providing challenges for choosing and validating the small number of peptides that might prove diagnostically useful. To accomplish this validation, we used 2D chromatography coupled with SRM of candidate peptides. We applied this approach, called sequential analysis of fractionated eluates by SRM (SAFE-SRM), to plasma from cancer patients and discovered two peptides encoded by the peptidyl-prolyl cis–trans isomerase A (PPIA) gene whose abundance was increased in the plasma of ovarian cancer patients. At optimal thresholds, elevated levels of at least one of these two peptides were detected in 43 (68.3%) of 63 women with ovarian cancer but in none of 50 healthy controls. In addition to providing a potential biomarker for ovarian cancer, this approach is generally applicable to the discovery of peptides characteristic of various disease states.

Recommended citation: Wang, Q., Zhang, M., Tomita, T., Vogelstein, J.T., Zhou, S., Papadopoulos, N., Kinzler, K.W. and Vogelstein, B. (2017). "Selected reaction monitoring approach for validating peptide biomarkers." Proceedings of the National Academy of Sciences. 114(51). https://www.pnas.org/doi/full/10.1073/pnas.1712731114

Sparse Projection Oblique Randomer Forests

Journal of Machine Learning Research, 2020

Decision forests, including Random Forests and Gradient Boosting Trees, have recently demonstrated state-of-the-art performance in a variety of machine learning settings. Decision forests are typically ensembles of axis-aligned decision trees; that is, trees that split only along feature dimensions. In contrast, many recent extensions to decision forests are based on axis-oblique splits. Unfortunately, these extensions forfeit one or more of the favorable properties of decision forests based on axis-aligned splits, such as robustness to many noise dimensions, interpretability, or computational efficiency. We introduce yet another decision forest, called "Sparse Projection Oblique Randomer Forests" (SPORF). SPORF uses very sparse random projections, i.e., linear combinations of a small subset of features. SPORF significantly improves accuracy over existing state-of-the-art algorithms on a standard benchmark suite for classification with > 100 problems of varying dimension, sample size, and number of classes. To illustrate how SPORF addresses the limitations of both axis-aligned and existing oblique decision forest methods, we conduct extensive simulated experiments. SPORF typically yields improved performance over existing decision forests, while maintaining computational efficiency, scalability, and interpretability. Very sparse random projections can be incorporated into gradient boosted trees to obtain potentially similar gains.

Recommended citation: Tomita, Tyler et al. (2020). "Sparse Projection Oblique Randomer Forests." Journal of Machine Learning Research. 21(104). https://www.jmlr.org/papers/volume21/18-664/18-664.pdf
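The "very sparse random projections" described in the abstract above are easy to sketch. The snippet below is a minimal, hypothetical NumPy illustration, not the released SPORF implementation; the function and parameter names (sparse_random_projections, sparsity, n_projections) are invented for this example. Each candidate split direction is a signed combination of a small random subset of features.

    # Illustrative sketch only: generate sparse +/-1 projection directions and
    # project the data onto them, as a tree node in an oblique forest might do
    # before searching each projected coordinate for the best split threshold.
    import numpy as np

    def sparse_random_projections(n_features, n_projections, sparsity, rng):
        """Return an (n_features, n_projections) matrix whose entries are mostly
        zero; each entry is nonzero with probability `sparsity`, then +1 or -1."""
        mask = rng.random((n_features, n_projections)) < sparsity
        signs = rng.choice([-1.0, 1.0], size=(n_features, n_projections))
        return mask * signs

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 20))    # toy data: 100 samples, 20 features
    A = sparse_random_projections(n_features=20, n_projections=5, sparsity=0.1, rng=rng)
    candidates = X @ A                # each column is one candidate oblique split coordinate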

Robust Similarity and Distance Learning via Decision Forests

arXiv preprint, 2020

Canonical distances such as Euclidean distance often fail to capture the appropriate relationships between items, subsequently leading to subpar inference and prediction. Many algorithms have been proposed for automated learning of suitable distances, most of which employ linear methods to learn a global metric over the feature space. While such methods offer nice theoretical properties, interpretability, and computationally efficient means for implementing them, they are limited in expressive capacity. Methods which have been designed to improve expressiveness sacrifice one or more of the nice properties of the linear methods. To bridge this gap, we propose a highly expressive novel decision forest algorithm for the task of distance learning, which we call Similarity and Metric Random Forests (SMERF). We show that the tree construction procedure in SMERF is a proper generalization of standard classification and regression trees. Thus, the mathematical driving forces of SMERF are examined via its direct connection to regression forests, for which theory has been developed. Its ability to approximate arbitrary distances and identify important features is empirically demonstrated on simulated data sets. Last, we demonstrate that it accurately predicts links in networks.

Recommended citation: Tomita, Tyler M. and Joshua T. Vogelstein. (2022). "Robust Similarity and Distance Learning via Decision Forests." arXiv preprint. https://arxiv.org/pdf/2007.13843.pdf
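To make the idea of learning a similarity with a decision forest concrete, here is a small, hedged sketch of the classical forest-proximity construction in scikit-learn, in which two points are similar when they frequently fall into the same leaves. It illustrates the general family of forest-based similarity methods, not SMERF's specific tree-construction procedure, and the toy data and variable names are invented.

    # Classical random-forest proximity as a learned similarity (illustration only).
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)   # toy regression target

    forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    leaves = forest.apply(X)          # shape (n_samples, n_trees): leaf index per tree

    # Similarity of two points = fraction of trees in which they share a leaf;
    # 1 - similarity then behaves like a data-driven dissimilarity.
    similarity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
    dissimilarity = 1.0 - similarity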

The Similarity Structure of Real-World Memories

bioRxiv preprint, 2021

How do we mentally organize our memories of life events? Two episodes may be connected because they share a similar location, time period, activity, spatial environment, or social and emotional content. However, we lack an understanding of how each of these dimensions contributes to the perceived similarity of two life memories. We addressed this question with a data-driven approach, eliciting pairs of real-life memories from participants. Participants annotated the social, purposive, spatial, temporal, and emotional characteristics of their memories. We found that the overall similarity of memories was influenced by all of these factors, but to very different extents. Emotional features were the most consistent single predictor of overall memory similarity. Memories with different emotional tone were reliably perceived to be dissimilar, even when they occurred at similar times and places and involved similar people; conversely, memories with a shared emotional tone were perceived as similar even when they occurred at different times and places, and involved different people. A predictive model explained over half of the variance in memory similarity, using only information about (i) the emotional properties of events and (ii) the primary action or purpose of events. Emotional features may make an outsized contribution to event similarity because they provide compact summaries of an event’s goals and self-related outcomes, which are critical information for future planning and decision making. Thus, in order to understand and improve real-world memory function, we must account for the strong influence of emotional and purposive information on memory organization and memory search.

Recommended citation: Tomita, Tyler M., Barense, Morgan D., and Honey, Christopher J. (2021). "The Similarity Structure of Real-World Memories." Submitted. https://www.biorxiv.org/content/biorxiv/early/2021/01/30/2021.01.28.428278.full.pdf

Contextual Representation Ensembling for Continual Lifelong Learning

Conference on Cognitive Computational Neuroscience 2022, 2022

Real-world agents must be able to efficiently acquire new skills over a lifetime, a process called "continual learning." Current continual machine learning models fall short because they do not selectively and flexibly transfer prior knowledge representations to novel contexts. We propose a cognitively inspired model called Contextual Representation Ensembling (CRE), which fills this gap. We compared CRE to other state-of-the-art continual machine learning models as well as other baseline models on a simulated continual learning experiment. CRE demonstrated superior transfer to novel contexts and superior remembering when old contexts are re-encountered. Our results suggest that, in order to achieve efficient continual learning in the real world, an agent must have two abilities: (i) it must be able to recognize context cues within the environment in order to infer what prior knowledge might be relevant to the current context, and (ii) it must be able to flexibly recombine prior knowledge.

Recommended citation: Tomita, Tyler. (2022). "Contextual Representation Ensembling." Conference on Cognitive Computational Neuroscience 2022. https://2022.ccneuro.org/proceedings/0000134.pdf

Representation Ensembling for Synergistic Lifelong Learning with Quasilinear Complexity

preprint, 2022

In lifelong learning, data are used to improve performance not only on the current task, but also on previously encountered and as yet unencountered tasks. In contrast, classical machine learning starts from a blank slate, or tabula rasa, and uses data only for the single task at hand. While typical transfer learning algorithms can improve performance on future tasks, their performance on prior tasks degrades upon learning new tasks (called forgetting). Many recent approaches for continual or lifelong learning have attempted to maintain performance on old tasks given new tasks. But striving to avoid forgetting sets the goal unnecessarily low. The goal of lifelong learning should be not only to improve performance on future tasks (forward transfer) but also on past tasks (backward transfer) with any new data. Our key insight is that we can synergistically ensemble representations that were learned independently on disparate tasks to enable both forward and backward transfer. This generalizes ensembling decisions (as in decision forests) and complements ensembling dependently learned representations (as in multitask learning). Moreover, we can ensemble representations in quasilinear space and time. We demonstrate this insight with two algorithms: representation ensembles of (1) trees and (2) networks. Both algorithms demonstrate forward and backward transfer in a variety of simulated and benchmark data scenarios, including tabular, image, spoken, and adversarial tasks. This is in stark contrast to the reference algorithms we compared to, most of which failed to transfer either forward or backward, or both, even though many of them require quadratic space or time complexity.

Recommended citation: Vogelstein, J., Dey, J., Helm, H., Levine, W., Mehta, R., Tomita, T.M., Xu, H., Geisa, A., van de Ven, G., Chang, E., Gao, C., Yang, W., Tower, B., Larson, J., White, C.M., and Priebe, C.E. (2022). "Representation Ensembling for Synergistic Lifelong Learning with Quasilinear Complexity." Submitted. http://tyler-tomita.github.io/files/llf.pdf
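The key insight of the abstract above, that representations learned independently on earlier tasks can be reused together when a new task arrives, can be conveyed with a deliberately simplified, stacking-style sketch. This is not the paper's algorithm (which builds representation ensembles of trees and of networks in quasilinear space and time); the synthetic tasks, helper names, and model choices below are invented for illustration only.

    # Toy sketch: one encoder learned independently per prior task, then a new
    # task's classifier is trained on the concatenation of all encoders' outputs.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    def make_task(shift):
        # Simple synthetic binary task whose distribution depends on `shift`.
        X = rng.normal(size=(300, 5)) + shift
        y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
        return X, y

    # Representations (here, forests) learned independently on two prior tasks.
    prior_tasks = [make_task(0.0), make_task(1.0)]
    encoders = [RandomForestClassifier(n_estimators=50, random_state=i).fit(X, y)
                for i, (X, y) in enumerate(prior_tasks)]

    def cross_task_representation(X):
        # Concatenate each prior task's learned "view" of the new data.
        return np.hstack([enc.predict_proba(X) for enc in encoders])

    # A new task reuses every prior representation through a simple final model.
    X_new, y_new = make_task(0.5)
    final_model = LogisticRegression().fit(cross_task_representation(X_new), y_new)
    print(final_model.score(cross_task_representation(X_new), y_new))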

Talks

Teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.