InnovationLens

automated hypothesis generation

Quickstart: choose a topic from the buttons below (more topics coming soon). The site is not yet mobile-friendly; use a laptop or desktop with a mouse or trackpad. On the data map of recent scientific articles, click and drag to pan, scroll to zoom, and shift-click and drag to draw a lasso around the area you want to focus on. The number of points selected is shown at the bottom right of the screen. For best results, select at least 500 points. When you are happy with your selection, click "Confirm" to begin calculating future abstracts. The calculation may take a few seconds, so please be patient.

FAQ

Description of Service

InnovationLens offers a way to gain an overview of existing scientific research and automatically generate hypotheses for future research.

Who is this for?

Professional scientists and students alike will find InnovationLens a helpful aid when deciding on their next research project: our algorithm helps identify topics that are innovative and not yet covered in the literature, while also being close enough to recent research that they fit within a school of thought and a tradition. Other professionals who are interested in near-future trends, like hedge fund analysts and venture capitalists, are also likely to find our service useful. Officials in granting agencies and university research funding offices can use our algorithm to direct their resources toward the topics that are most likely to produce high-value research.

It sounds too good to be true. How does it work?

We discovered patterns within the distribution of scientific research that predict where high-value research is likely to occur in the near future (the next 1-6 years). We have demonstrated that the predictive value of these patterns is significantly higher than that of two baselines: the results a researcher would obtain by studying topics distributed evenly across a given domain, and the results that would be achieved by remaining within the distribution of topics that have already been studied in that domain. Our algorithm is able to reliably indicate topics that lie close enough to existing research to be well-supported, and far enough away to be innovative and possibly to become the highly cited starting point of a new line of research. More details on the statistical analysis underlying these claims will be found in our forthcoming white paper.

Once we discovered these patterns, the question became: how do we communicate the meaning of a "location" in the abstract mathematical space we found this pattern in? How can this pattern become actionable insight for researchers who are trying to optimize their time, energy, and resources to do the best research they can do? It took a few years of work to answer that question, and work is ongoing. InnovationLens presents human-readable text descriptions of the topics our algorithm has determined likely to be the location of high-value research in the future.
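To make the "close enough, but far enough away" idea concrete, here is a toy sketch. It is not our algorithm: it assumes a simple rule in which a candidate location is kept when its distance to the nearest existing article falls inside a band. The names `nearest_distance`, `in_novelty_band`, `R_MIN`, and `R_MAX`, as well as the random 2-D data, are hypothetical and chosen only for illustration.

```python
import math
import random


def nearest_distance(point, corpus):
    """Euclidean distance from `point` to its nearest neighbour in `corpus`."""
    return min(math.dist(point, other) for other in corpus)


def in_novelty_band(point, corpus, r_min, r_max):
    """True if `point` is neither too close to nor too far from existing work."""
    return r_min <= nearest_distance(point, corpus) <= r_max


# Toy 2-D "map" of existing articles (random data, not real embeddings).
random.seed(0)
corpus = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]

# Candidate locations spread evenly across the region, similar in spirit
# to the "evenly distributed" baseline described above.
candidates = [(random.uniform(-4, 4), random.uniform(-4, 4)) for _ in range(200)]

# Hypothetical band limits: an assumption made for this sketch only.
R_MIN, R_MAX = 0.2, 0.6
selected = [c for c in candidates if in_novelty_band(c, corpus, R_MIN, R_MAX)]

print(f"{len(selected)} of {len(candidates)} candidates fall inside the band")
```

In this toy version, candidates that sit on top of existing articles are rejected as not innovative, and candidates in empty regions are rejected as unsupported. The real patterns are statistical and will be described in the white paper.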

So maybe you don’t want to explain how you discover these patterns. But once you find them, how do you write “future abstracts” for articles that have yet to be written?

Large language models can be understood as high-dimensional vector spaces that contain huge amounts of information: in some cases, nearly all the words humans have ever written down. These spaces and their properties contain many surprises. Our hypothesis is that in some cases, novel vectors (which point to "locations" in the space of the language model) that do not correspond to any existing text can still contain valid information, because the space in which they exist is itself so information-rich. We therefore identify the vectors that correspond to our predictive points, and reverse engineer each vector in order to produce human-readable text. Research on reverse engineering is ongoing, and we expect to roll out improvements to this algorithm.
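As a rough illustration of what a "novel vector" is, the toy sketch below builds one as the midpoint of two existing text vectors and then reads it back in the crudest possible way. It is a stand-in, not our reverse-engineering method: it assumes bag-of-words counts instead of a language model, and it describes the new vector by its highest-weighted words and its most similar existing texts instead of generating an abstract. The example titles and the helper names `embed` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def embed(text, vocab):
    """Toy embedding: word counts over a fixed vocabulary (not a language model)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical article titles used only for this example.
texts = [
    "graph neural networks for molecule property prediction",
    "transformers for protein structure prediction",
    "reinforcement learning for robot control",
]
vocab = sorted({word for text in texts for word in text.lower().split()})
vectors = [embed(text, vocab) for text in texts]

# A "novel" vector: the midpoint of the first two texts.
# It lies between them and matches no existing text exactly.
novel = [(a + b) / 2 for a, b in zip(vectors[0], vectors[1])]

# Crude readout 1: the vocabulary words with the largest weight in the vector.
weighted = sorted(zip(vocab, novel), key=lambda pair: (-pair[1], pair[0]))
top_words = [word for word, weight in weighted if weight > 0][:5]

# Crude readout 2: existing texts ranked by similarity to the novel vector.
ranked = sorted(texts, key=lambda t: cosine(novel, embed(t, vocab)), reverse=True)

print("top words:", top_words)
print("closest existing texts:", ranked[:2])
```

The point of the sketch is only that a location with no text of its own can still carry recoverable meaning from its surroundings. Turning such a location into a fluent abstract is the hard part, and it is where our ongoing research is focused.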

Why should I pay for this service?

Many students and researchers are aware that there are too many articles on any given topic to read. Even very small niches are packed with so many articles that it is very challenging for a new researcher to know where a significant contribution can be made. And the situation is not much better for seasoned researchers: so many articles are published each year that no one can stay up to date. This situation raises the question: could we do better with the aid of automatic text analysis? Some other attempts to map the research space exist: Nomic’s Atlas is one notable example of a visualization similar to ours. However, InnovationLens is the only service that generates “future abstracts” to suggest topics for future research.

InnovationLens is a sophisticated way to give researchers a “lens” with which to view the vast spaces of existing research, and bring into focus the topics that are ripe for innovation. It is computationally intensive, and GPUs aren’t free, so we have to charge to cover our costs. But we think the low cost of the service provides high value: for the price of a few lattes, we offer a panoramic view of ideas that probably would not all occur to a researcher looking for their next topic. If this small investment can help you dedicate your months and years of research to more fruitful topics, we will have reached our goal.

What research topics have you validated this system on?

The underlying statistical research was done using more than 400,000 articles from the Computer Science section of arXiv. We will continue to validate and publish our results for other domains as well.

I tried the service, but the quality of the “future abstracts” is low. Why?

We have released this product because our statistical analysis over the entire domain of CompSci articles convinced us that it produces valid results often enough to be useful. The “future abstracts” sometimes contain repetitions or meaningless phrases, or are simply too short to be of any use. There are many reasons for this, and we are improving their quality by improving the underlying algorithm, the language models, and through other avenues of our active research. It is also important to note that not all research domains contain enough material for our algorithm to find reliable predictive points. If your selection returns only a handful of predictive points, their quality might be low.

If your results were low quality, try selecting a larger area (that is, more articles), and try generating more results from the list of predictive points. As our research advances, we will publish more white papers that measure how amenable various subjects are to our method.

I cannot select articles from X years ago. Why?

The sheer volume of scientific articles imposes some design decisions on a consumer-facing product such as InnovationLens. In our validation research, we established that a look-back of six years is sufficient to produce useful results for researchers who want to use our service to help decide their next topics of inquiry. We keep at least the last six years in our graphical tool, and where possible we keep a longer look-back period. If you are interested in backtesting for other purposes, or you want API access to use our algorithm for larger datasets than can be displayed in a web browser, please contact us with your use case.

Why is your site so old-school?

InnovationLens is the work of a small team that doesn’t like front-end development and can’t be bothered to learn it. With the exception of the wonderful data map, written by Leland McInnes, the whole site is written in FastHTML. We preferred to spend our time and money on math and GPUs. That said, if you’d like to help us make it smoother and more beautiful, we’re accepting investment proposals for business development :)

Do you have a money back guarantee?

We occasionally offer one-time coupons for new users to try the system free of charge. But we cannot guarantee results that concern the future and cannot be objectively verified without actually doing the research they suggest. By using our system, you acknowledge that it is provided as-is, and that we do not make any guarantees about the quality of the results you receive based on your choice of input parameters.

Do you offer API access?

Please contact us at info@innovationlens.org with details of your use case.