122 Years of Moore’s Law + Tesla AI Update
Tesla now holds the mantle of Moore’s Law, with the D1 chip introduced last night for the DOJO supercomputer (video, news summary).
This should not be a surprise, as Intel ceded leadership to NVIDIA a decade ago, and further handoffs were inevitable. The computational frontier has shifted across many technology substrates over the past 120 years, most recently from the CPU to the GPU to ASICs optimized for neural networks (the majority of new compute cycles). The ASIC approach is being pursued by scores of new companies; Google TPUs have now been added to the chart by popular request (see note below for methodology), as has the Mythic analog M.2.
Of all of the depictions of Moore’s Law, this is the one I find to be most useful, as it captures what customers actually value — computation per $ spent (note: on a log scale, so a straight line is an exponential; each y-axis tick is 100x).
Humanity’s capacity to compute has compounded for as long as we can measure it, exogenous to the economy, and starting long before Intel co-founder Gordon Moore noticed a refraction of the longer-term trend in the belly of the fledgling semiconductor industry in 1965.
Why the transition within the integrated circuit era? Intel lost to NVIDIA for neural networks because the fine-grained parallel compute architecture of a GPU maps better to the needs of deep learning. There is a poetic beauty to the computational similarity of a processor optimized for graphics processing and the computational needs of a sensory cortex, as commonly seen in neural networks today. A custom chip (like the Tesla D1 ASIC) optimized for neural networks extends that trend to its inevitable future in the digital domain. Further advances are possible in analog in-memory compute, an even closer biomimicry of the human cortex. The best business planning assumption is that Moore’s Law, as depicted here, will continue for the next 20 years as it has for the past 120.
For those unfamiliar with this chart, here is a more detailed description:
Moore’s Law is both a prediction and an abstraction
Moore’s Law is commonly reported as a doubling of transistor density every 18 months. But this is not something Gordon Moore, the co-founder of Intel, ever said. It is a nice blending of his two predictions: in 1965, he predicted an annual doubling of transistor counts in the most cost-effective chip, and he revised it in 1975 to every 24 months. With a little hand-waving, most reports attribute 18 months to Moore’s Law, but there is quite a bit of variability. The popular perception of Moore’s Law is that computer chips are compounding in their complexity at near-constant per-unit cost. This is one of the many abstractions of Moore’s Law, and it relates to the compounding of transistor density in two dimensions. Others relate to speed (the signals have less distance to travel) and computational power (speed × density).
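As a back-of-the-envelope illustration (my own arithmetic, not chart data), the gap between those doubling periods compounds quickly, which is why the "18 months" blend matters:

```python
def growth_multiple(years: float, doubling_months: float) -> float:
    """Compound growth factor after `years` for a given doubling period."""
    return 2 ** (years * 12 / doubling_months)

# A decade of compounding under the three commonly cited periods:
for d in (12, 18, 24):
    print(f"doubling every {d} months -> {growth_multiple(10, d):,.0f}x in 10 years")
```

A decade at a 12-month doubling yields roughly 1,000x; at 24 months, only 32x.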
Unless you work for a chip company and focus on fab-yield optimization, you do not care about transistor counts. Integrated circuit customers do not buy transistors. Consumers of technology purchase computational speed and data storage density. When recast in these terms, Moore’s Law is no longer a transistor-centric metric, and this abstraction allows for longer-term analysis.
What Moore observed in the belly of the early IC industry was a derivative metric, a refracted signal, from a longer-term trend, a trend that begs various philosophical questions and predicts mind-bending futures.
Ray Kurzweil’s abstraction of Moore’s Law shows computational power on a logarithmic scale, and finds a double-exponential curve that holds over 120 years! On this scale, a straight line would represent a geometrically compounding curve of progress; the data bends slightly upward.
Through five paradigm shifts – such as electro-mechanical calculators and vacuum tube computers – the computational power that $1000 buys has doubled every two years. For the past 35 years, it has been doubling every year.
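The rough arithmetic of that compounding (illustrative only, splitting the 120 years into roughly 85 at a two-year doubling and 35 at a one-year doubling) can be sketched directly:

```python
import math

# Doubling every two years for the first ~85 years of the chart,
# then every year for the past 35 (illustrative arithmetic):
early = 2 ** (85 / 2)    # 42.5 doublings
recent = 2 ** 35         # 35 more doublings
total = early * recent

# Each y-axis tick on the chart is 100x, i.e. two orders of magnitude:
print(f"total gain ~ 10^{math.log10(total):.0f}, "
      f"or ~{math.log10(total) / 2:.0f} chart ticks")
```

Even this crude split implies on the order of twenty-plus orders of magnitude in price performance.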
Each dot is the frontier of computational price performance of the day. One machine was used in the 1890 Census; one cracked the Nazi Enigma cipher in World War II; one predicted Eisenhower’s win in the 1956 Presidential election. Many of them can be seen in the Computer History Museum.
Each dot represents a human drama. Prior to Moore’s first paper in 1965, none of them even knew they were on a predictive curve. Each dot represents an attempt to build the best computer with the tools of the day. Of course, we use these computers to make better design software and manufacturing control algorithms. And so the progress continues.
Notice that the pace of innovation is exogenous to the economy. The Great Depression and the World Wars and various recessions do not introduce a meaningful change in the long-term trajectory of Moore’s Law. Certainly, the adoption rates, revenue, profits and economic fates of the computer companies behind the various dots on the graph may go through wild oscillations, but the long-term trend emerges nevertheless.
Any one technology, such as the CMOS transistor, follows an elongated S-shaped curve of slow progress during initial development, upward progress during a rapid adoption phase, and then slower growth from market saturation over time. But a more generalized capability, such as computation, storage, or bandwidth, tends to follow a pure exponential – bridging across a variety of technologies and their cascade of S-curves.
In the modern era of accelerating change in the tech industry, it is hard to find even five-year trends with any predictive value, let alone trends that span the centuries. I would go further and assert that this is the most important graph ever conceived.
Why is this the most important graph in human history?
A large and growing set of industries depends on continued exponential cost declines in computational power and storage density. Moore’s Law drives electronics, communications and computers, and has become a primary driver in drug discovery, biotech and bioinformatics, medical imaging and diagnostics. As Moore’s Law crosses critical thresholds, a formerly lab science of trial-and-error experimentation becomes a simulation science, and the pace of progress accelerates dramatically, creating opportunities for new entrants in new industries. Boeing used to rely on wind tunnels to test novel aircraft design performance. Ever since CFD modeling became powerful enough, design has moved to the rapid pace of iterative simulations, and the nearby wind tunnels of NASA Ames lie fallow. The engineer can iterate at a rapid rate while simply sitting at their desk.
Every industry on our planet is going to become an information business. Consider agriculture. If you ask a farmer in 20 years’ time how they compete, the answer will depend on how they use information, from satellite imagery driving robotic field optimization to the code in their seeds. It will have nothing to do with workmanship or labor. That dynamic will eventually percolate through every industry as IT innervates the economy.
Non-linear shifts in the marketplace are also essential for entrepreneurship and meaningful change. Technology’s exponential pace of progress has been the primary juggernaut of perpetual market disruption, spawning wave after wave of opportunities for new companies. Without disruption, entrepreneurs would not exist.
Moore’s Law is not just exogenous to the economy; it is why we have economic growth and an accelerating pace of progress. At Future Ventures, we see that in the growing diversity and global impact of the entrepreneurial ideas that we see each year. The industries impacted by the current wave of tech entrepreneurs are more diverse, and an order of magnitude larger than those of the ’90s — from automobiles and aerospace to energy and chemicals.
At the cutting edge of computational capture is biology; we are actively reengineering the information systems of biology and creating synthetic microbes whose DNA is manufactured from bare computer code and an organic chemistry printer. But what to build? So far, we largely copy large tracts of code from nature. But the question spans across all the complex systems that we might wish to build, from cities to designer microbes, to computer intelligence.
As these systems transcend human comprehension, we will shift from traditional engineering to evolutionary algorithms and iterative learning algorithms like deep learning and machine learning. As we design for evolvability, the locus of learning shifts from the artifacts themselves to the process that created them. There is no mathematical shortcut for the decomposition of a neural network or genetic program, no way to "reverse evolve" with the ease that we can reverse engineer the artifacts of purposeful design. The beauty of compounding iterative algorithms (evolution, fractals, organic growth, art) derives from their irreducibility. And it empowers us to design complex systems that exceed human understanding.
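A toy sketch of that idea (my own illustration, not any particular system): a minimal evolutionary loop breeds bitstrings toward a target, and the evolved artifact records nothing about how it was found — the knowledge lives in the iterative process, not the result.

```python
import random

def evolve(target, pop_size=50, mutation_rate=0.05, generations=200, seed=1):
    """Toy evolutionary search: breed bitstrings toward a target pattern."""
    rng = random.Random(seed)
    n = len(target)
    fitness = lambda s: sum(a == b for a, b in zip(s, target))
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == n:          # perfect match found
            break
        parents = pop[: pop_size // 2]    # selection: keep the fitter half
        pop = [[1 - g if rng.random() < mutation_rate else g   # mutation
                for g in rng.choice(parents)]
               for _ in range(pop_size)]
    return max(pop, key=fitness)

target = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
best = evolve(target)
```

Inspecting `best` tells you nothing about the selection pressure or mutation history that produced it — an (admittedly tiny) instance of the irreducibility described above.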
Why does progress perpetually accelerate?
All new technologies are combinations of technologies that already exist. Innovation does not occur in a vacuum; it is a combination of ideas from before. In any academic field, the advances of today are built on a large edifice of history. This is why major innovations tend to be ‘ripe’ and tend to be discovered at nearly the same time by multiple people. The compounding of ideas is the foundation of progress, something that was not so evident to the casual observer before the age of science. Science tuned the process parameters for innovation, and became the best method for a culture to learn.
From this conceptual base comes the origin of economic growth and accelerating technological change, as the combinatorial explosion of possible idea pairings grows exponentially as new ideas come into the mix (on the order of 2^n possible groupings, per Reed’s Law). It explains the innovative power of urbanization and networked globalization. And it explains why interdisciplinary ideas are so powerfully disruptive; it is like the differential immunity of epidemiology, whereby islands of cognitive isolation (e.g., academic disciplines) are vulnerable to disruptive memes hopping across, much like South America was to smallpox from Cortés and the Conquistadors. If disruption is what you seek, cognitive island-hopping is a good place to start, mining the interstices between academic disciplines.
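The combinatorics can be sketched directly (illustrative arithmetic only): simple pairings grow polynomially as n choose 2, while Reed's-Law-style groupings grow as 2^n − n − 1.

```python
from math import comb

def pairings(n: int) -> int:
    """Two-idea combinations: n choose 2."""
    return comb(n, 2)

def reed_groups(n: int) -> int:
    """Reed's Law: nontrivial subgroups of n ideas, 2**n - n - 1."""
    return 2 ** n - n - 1

# Pairings grow polynomially; possible groupings explode exponentially:
for n in (10, 20, 40):
    print(f"{n} ideas: {pairings(n):,} pairings, {reed_groups(n):,} groupings")
```

Doubling the pool of ideas from 20 to 40 roughly quadruples the pairings but multiplies the possible groupings by about a million.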
It is the combinatorial explosion of possible innovation-pairings that creates economic growth, and it’s about to go into overdrive. In recent years, we have begun to see the global innovation effects of a new factor: the internet. People can exchange ideas like never before. Long ago, people were not communicating across continents; ideas were partitioned, and so the success of nations and regions pivoted on their own innovations. Richard Dawkins states that in biology it is genes which really matter, and we as people are just vessels for the conveyance of genes. It’s the same with ideas, or “memes”. We are the vessels that hold and communicate ideas, and now that pool of ideas percolates on a global basis more rapidly than ever before.
In the next six years, three billion minds will come online for the first time to join this global conversation (via inexpensive smartphones in the developing world). This rapid influx of three billion people into the global economy is unprecedented in human history, and so, too, will be the pace of idea-pairings and progress.
We live in interesting times, at the cusp of the frontiers of the unknown and breathtaking advances. But, it should always feel that way, engendering a perpetual sense of future shock.
By jurvetson on 2021-08-20 19:35:34
Extracting predictive patterns and relevant information from huge data sets is termed data mining. It helps in acquiring patterns that contribute to decision-making.
Data is frequently stored in large relational databases, and the amount of data stored can be enormous. But what does this data mean? How can a company or organization discover the patterns that are critical to its performance and then act on those patterns? Manually wading through the data stored in a large database and then figuring out what matters to your organization can be next to impossible.

This is where data mining techniques come to the rescue. Data mining software analyzes huge quantities of data and then determines predictive patterns by examining relationships.
Data Mining Techniques
There are various data mining (DM) techniques, and the kind of data being analyzed strongly influences which technique is used.

Note that the field of data mining is continually evolving, and new DM techniques are being implemented all the time.

Generally speaking, data mining software relies on a few fundamental techniques: clustering, classification, regression, and association methods.
Clustering refers to the formation of data clusters that are grouped together by some kind of relationship that identifies the data as being similar. An example of this would be sales data that is clustered into specific markets.
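A minimal sketch of the clustering idea, assuming toy sales figures and a simple 1-D k-means (illustrative only, not any particular product's algorithm):

```python
def kmeans_1d(values, k=2, iters=20):
    """Minimal 1-D k-means: group similar numbers around k centroids."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread initial centroids across the data range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its assigned points.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Toy "sales" figures that fall into two obvious market segments:
sales = [12, 14, 11, 13, 95, 102, 98, 99]
centroids, clusters = kmeans_1d(sales, k=2)
print(sorted(centroids))  # roughly one centroid per market segment
```

Each centroid ends up at the center of one market segment, with similar sales figures grouped around it.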
Classification groups data by applying a known structure to the data warehouse being examined. This technique works well for categorical data and uses one or more algorithms, such as decision tree learning, neural networks, and “nearest neighbor” methods.
Regression uses mathematical equations and is well suited to numerical data. It essentially looks at the numerical data and then attempts to fit an equation to that data.

New data can then be applied to the equation, which yields the predictive analysis.
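A minimal regression sketch: fit a line to numerical data by ordinary least squares, then apply new data to the fitted equation for the predictive step (the numbers are made up for illustration):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b, in closed form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return a, b

# Fit an equation to historical numerical data...
xs, ys = [1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]
a, b = fit_line(xs, ys)

# ...then apply a new data point to the equation for a prediction:
prediction = a * 5 + b
```

The fitted slope and intercept summarize the historical data; predictions are just evaluations of that equation on new inputs.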
Often referred to as “association rule learning,” this popular technique involves the discovery of interesting relationships between variables in the data warehouse (where the data is stored for analysis). Once an association “rule” has been established, predictions can then be made and acted on. An example of this is shopping: if people buy a particular item, there may be a high chance that they also buy another specific item, and the store manager could then make sure these items are placed near each other.
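A minimal sketch of association rule mining over toy shopping baskets (the data and the support/confidence thresholds are illustrative assumptions):

```python
from itertools import combinations
from collections import Counter

def association_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Find simple one-to-one rules: 'buyers of X also tend to buy Y'."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in set(b))
    pair_counts = Counter(pair for b in baskets
                          for pair in combinations(sorted(set(b)), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n              # how often the pair appears at all
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]   # P(y bought | x bought)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules

baskets = [
    {"bread", "butter"}, {"bread", "butter", "milk"},
    {"bread", "milk"}, {"butter"}, {"bread", "butter"},
]
rules = association_rules(baskets)
```

Each rule `(x, y, support, confidence)` reads "buyers of x also buy y": the store manager's cue to shelve the items together.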