AI and the Future of Programming

Continuing the discussion from Bret Victor - The Future of Programming:

@AndrewCodeDev and I exchanged some private messages on this, so let me quote what I sent to him here:

I think this is enough to kick-off this thread. (Shields are up. Ready for incoming fire.) :rocket:

2 Likes

Hello. Having run programming tests myself, and having pushed the envelope quite far, I can say that the AI is not up to par, whether in assembler or in Zig.

The ideas are right, but the practice is wrong.

2 Likes

I don’t want to be contrarian, because I feel like none of us can possibly know what the future holds. I’ve watched that talk of yours about the Lego demo. Unfortunately, I’m still not convinced. I get that we are at the beginning of AI and are still collectively figuring out how to use it, but most of the time, when I see someone holding the same opinion as you, I’m left with a lot of unanswered questions.

For example, I don’t see how AI will be able to build large-scale software when the whole world of software is an ever-changing landscape: APIs get deprecated, new libraries come and go, techniques and languages evolve.

I remember when I was learning Zig, I tried to use ChatGPT to answer some of my questions and it would answer back in Rust; then, when the new ChatGPT was released, it would answer me but suggest APIs that had changed since then. I don’t see a clear way to solve this kind of issue.

Also, when you work on software where pieces are moving everywhere, how can an AI anticipate that, while you are working on a feature in one branch, another team is completely rewriting a subsidiary part of the software?

Also, while ChatGPT is very impressive on very “static” code, in my experience it lacks a lot of the creativity and attention to detail that define good software. How can AI improve in those areas without completely changing the techniques currently used? I don’t believe that LLM techniques are going to cut it.

For example, you took the game of Go as an example, and while it’s an impressive achievement, I don’t think it maps to software very well. Whether it’s Go or chess, those models aren’t beating the game, they are beating the player; humans are flawed and predictable. I don’t see how that would translate to playing the “world” of software.

My point is: while I’m very optimistic about what AI will bring to the table, and how drastically it will improve both accessibility and productivity, I can’t fathom how software engineering could turn into prompt engineering in the span of a few years without incredible breakthroughs in technique and hardware.

4 Likes

I already use AI to assist me in my daily work. If you give it enough context, you can get some amazing results. It’s only a matter of time until it’s perfect, but I don’t know if it’s going to be that fast; jumping from 99.9 to 99.99 can be extremely difficult.

1 Like

My tests are not limited to a single question; they run over several hours and more.
For the moment it is not up to par in terms of code. If we discuss functional analysis, its answers are often judicious, but apart from translation, where I find it coherent, I repeat: it is not up to the task.

I have to agree with the more conservative people here. I don’t really think that “perfection is just a matter of time” when the entire idea of LLMs doesn’t seem suited to achieving it. It’s like trying to optimize some code when the actual problem is the algorithm itself: you’ll get a lot of speed improvements at first, but that algorithm has an upper limit and you’ll hit it. I’m open to surprises, but they have yet to arrive.

3 Likes

This is the most recent article I read on the subject: https://correctiv.org/en/fact-checking-en/2024/05/23/dont-bother-asking-ai-about-the-eu-elections-how-chatbots-fail-when-it-comes-to-politics/
Of course, that article is only about politics, but if chatbots can’t get simple facts right, how are they supposed to write correct software?

1 Like

Isn’t most AI stuff written in Python? I think Mojo is going to speed things up dramatically, not just because it’s a much, much faster Python, but because I think Mojo is going to attract more developers. Faster AI would also be able to learn faster, so the iterative development of those LLM models would be faster.

An excellent analogy is LCD technology. When it first emerged, it was groundbreaking: a significant advancement that provided an experience no CRT TV could match at the time. However, if you compare LCD to something like OLED, you’ll notice that even though high-end LCD models have improved tremendously, the fundamental limitations lie in the physics of the technology rather than in the manufacturing process.

Regarding AI, I’ve talked a lot with my best friend, who is a researcher at Hugging Face, and while he is very excited and optimistic about the field, he agrees with a lot of the points made in this talk: Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI | Lex Fridman Podcast #416.

TL;DR: Most of the progress we see today stems from techniques that already exhibit significant scalability limitations. Unless we discover a new approach to AI, I find it difficult to envision how we will advance beyond the capabilities of current systems like ChatGPT.

We will undoubtedly improve the quality of responses, increase the context window size, and introduce new interaction modes. However, AI currently lacks the ability to generate original thoughts or genuinely understand its actions. It can and will get better at emulating human-like responses, but I don’t foresee it surpassing this because these models are not replicating true intelligence. They don’t need to; they only need to be convincing enough for us to perceive them as intelligent.

11 Likes

This makes me think of the vacuum tubes that were once fundamental parts of computers: they wasted a lot of space and power to achieve… not a lot by today’s standards.

Besides, didn’t the first “neural network” even run on one of those computers?

1 Like

I’m not knowledgeable enough to answer that, but from what I recall from hours of curious research and YouTube videos, some computer programs that were very basic by today’s standards were able to pass the Turing test decades ago. My main point is that there is an enormous jump between emulating 99.999999% of human behavior and actually behaving like a human. And as long as we are the “benchmark”, the field won’t need to put in tremendous effort to fool us into thinking that “this is it”. Take a magician, for example: you know there is some sleight of hand involved in finding your card in the deck, but the fact that you can’t explain it doesn’t make it real “magic”. I think current AI techniques are just like the magician’s sleight of hand.

1 Like

Rate of progress

I think it is too simplistic to see the rate of progress as a single value. The rate at which different tasks get solved to better degrees will vary widely depending on the task, and making improvements will become more and more difficult. It also depends a lot on how well understood these tasks already are and how many free learning materials exist for them.

LLMs

Repeating the implementation of some well-documented existing algorithm isn’t difficult. Being able to quickly find it is still useful, but these technologies are marketed as oracles, which irks me because it claims more capability than they actually have, instead of as fuzzy, re-permutating search engines, which in many cases seems like a more accurate description of LLM results to me.

I haven’t really seen AI demonstrations where the AI is able to produce genuinely new content; it always requires pre-existing large heaps of data which are then basically made searchable and permutable. What I haven’t seen demonstrated is something that actually understands the concepts in that material at a precise level and then uses them to come up with new hypotheses, design experiments to test them, and effectively arrive at new thoughts that way.

Instead, it always seems like we take the quality embedded in the learning material, lose at least a bit of that quality and precision, and then get a result.

So at least with LLM-style models, the biggest benefit seems to be searchability of knowledge (while sacrificing a bit, or a lot, of precision).

Reasoning

The thing is, if it can’t improve the quality of the knowledge by actually reasoning about it (finding contradictions, logic errors, errors in formal reasoning), then the quality of the results won’t magically become better than what was originally in the training data; it is likely to be worse.

Quality of training data

An additional problem is that, with all the people posting their ChatGPT results as their own answers in forums and comments, without a machine-readable tag identifying them as something that came from an LLM, we will have more and more LLMs trained on the smoothed-out, blurred results of other LLMs.

I suspect this will make it more and more difficult to use website scraping, which seems to be the preferred method of building the big training sets, to collect content where somebody has actually thought through the claims in what was written.

If more of the input to the training data is hallucinated gibberish, indistinguishable from sentences actually written by somebody with knowledge of the topic, future LLMs will get worse results, unless they make use of older data sets that contain fewer hallucinations.

Filtering for quality

There will probably be techniques in the future to classify things better, or maybe to do actual reasoning, in order to filter out meaningless noise. (But I haven’t seen anything that actually demonstrates this.)

But I think at some point you hit a very difficult barrier, where it is extremely hard to say what is useful signal versus mere noise, and I think these hard problems will stall certain types of AI progress significantly.

Computability

I wouldn’t be surprised if overcoming some of those barriers even required things like quantum computers, so that you can tackle problems which are simply impossible to solve with classical computers in reasonable time.

So I think this topic fundamentally also has to tackle the question of different problems and how computable they are: how hard each problem is, what is required to solve it, and whether we can only create good approximations, or can’t even do that (intractability).

Applying appropriate techniques

And even if we can do specific things, the AI would have to be able to apply different techniques based on what makes sense for the problem at hand.

(I imagine we will get there eventually, but I wonder how much of it will be clever people teaching the AI when to do what, versus people creating some higher-level behavior loop that somehow results in the AI being able to teach itself eventually.)

Breakthroughs followed by plateaus

Personally, I find it more likely that we will repeatedly find breakthroughs that create a surge of progress, followed a bit later by a plateau where the existing techniques stall out, until somebody finds a new way to improve something.

Marketing and Hype

I also find that everybody shouting they will achieve AGI by such-and-such a date gives me the vibes of vaporware salesmen and scam artists hoping for quick investments before they ultimately do a rug-pull and disappear, instead of delivering anything that actually implements what they promised.

At least to me, it seems highly likely that many of those claims are just marketing ploys to trick people into investing, into hoping they will be part of a gold rush, when the gold was simply placed in the river where they were given the tour.

TL;DR

I agree with the more conservative answers. I think AI will be useful, but a lot of this looks like over-promising a technology, either by people who want to trick others into thinking it is more than it is, or by enthusiasts who extrapolate linearly and hope for a much bigger improvement than is likely without developing new, or many specialized, techniques.

6 Likes

Quality of training data

…

As machine learning gets more investment and the quality of content on the internet decreases, it becomes more worthwhile to pay to create or license data generated by people. There are already billion-dollar companies whose whole business model is providing high-quality data. You mentioned this in your comment:

(I imagine we will get there eventually, but I wonder how much of it will be clever people teaching the AI when to do what, versus people creating some higher-level behavior loop that somehow results in the AI being able to teach itself eventually.)

(edit: see also LLMs aren’t just trained on the internet anymore)

This won’t lead to super-intelligence, but it’d be interesting to see the capabilities of a model specifically trained on programming by professional programmers. I think this is basically inevitable as the field matures, and could lead to better design “instincts” than LLMs exhibit now. Currently, LLMs feel like an advanced version of Stack Overflow. At best, they provide clear, functional code, but this code often lacks the design considerations that an experienced programmer would include. It might be the case that 5 to 10 years from now, experienced programmers help train models. The main question is one of scale.

I think antirez’s post on the subject is still the best. But some of the limitations shown in that article could be mitigated as training data and techniques change.

1 Like

I’m more conservative as well. I’ve tried code helpers with a number of languages and they can do some pretty cool stuff. They definitely make the developer’s job easier, and if you stick to the mainstream languages, you can get a lot done without much effort. The main challenge is that you don’t understand the code when it breaks. And things will break. We are dealing with software, after all.

A couple issues I see:

Production software is different from hobby software

Building a small site or app is fine: it doesn’t take too much effort and you can ignore edge cases for the most part. Most of the AI demos are at this stage. It’s impressive and a great way to get an idea started.

Production software is a lot different. Autonomous AI work in a monorepo is going to cause a lot of problems. Some of this will be worked out: larger context windows, better RAG to get relevant snippets into memory, and so on. But I think the fundamental issue will remain. Not to mention CVEs: all an AI does is put together the next plausible string of characters, which could easily introduce security vulnerabilities and make them more widespread.
Multi-modal approaches will help with this somewhat: an AI that can scan for security issues and double-check the work. That gives us some extra coverage, but I think the underlying issue will still remain, albeit to a smaller and smaller degree.

Transference of learning

This is the big issue as I see it. I can’t take an LLM that has been “taught” C and have it apply that knowledge to Zig, or even Rust for that matter. The concepts cannot be transferred to another context. You can train it on both languages, sure, but does that allow transference of concepts? Not really. You are just teaching it how Zig code has been written.

@pierrelgol has a good explanation of it. The underlying technology has severe limitations in this space. I don’t think this is just a technology/algorithm issue either; I think there is something fundamental to biological intelligence that allows for transference. I don’t think a computer will ever be able to achieve it.

3 Likes

This is also the conclusion that Yann LeCun has come to. From what I remember, he said that current LLMs are trained on gigabytes of text and yet can’t achieve the practical reasoning of a five-year-old. His hypothesis is that most of our learning doesn’t come from text or voice but from the rest of our senses, and they are working on new methods that try to teach AI through other “senses”.

I think we need to be careful with claims from people who have a vested interest in promoting their own AI products. They have a great incentive to cherry-pick or even fake the capabilities of their AI.

Take, for example, the Lego house. It sure looks impressive how the AI takes an application that only shows a single Lego brick and expands it to show an entire house, nicely animated.

But I took a closer look, and at 8:22 you can see that the repository already has functionality that describes the Lego house. I would guess that all it really had to do was add a for loop that iterates through all the bricks in the predefined structure and adds them to the renderer.
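
If that guess is right, the change would be roughly on the order of this minimal Zig sketch. To be clear, all the names here (Brick, Renderer, house_layout, addBrick) are made up for illustration; the demo’s actual code isn’t visible in the video.

```zig
const std = @import("std");

// All names below are invented for illustration; the demo's real
// repository and renderer are not shown here.
const Brick = struct { x: i32, y: i32, z: i32 };

const Renderer = struct {
    bricks_drawn: usize = 0,

    fn addBrick(self: *Renderer, brick: Brick) void {
        // A real renderer would emit geometry here; this stub only counts bricks.
        _ = brick;
        self.bricks_drawn += 1;
    }
};

pub fn main() void {
    // Pretend this layout already existed in the repository,
    // like the house description visible at 8:22.
    const house_layout = [_]Brick{
        .{ .x = 0, .y = 0, .z = 0 },
        .{ .x = 1, .y = 0, .z = 0 },
        .{ .x = 0, .y = 1, .z = 0 },
    };

    var renderer = Renderer{};

    // The speculated change: loop over the predefined bricks
    // and hand each one to the renderer.
    for (house_layout) |brick| {
        renderer.addBrick(brick);
    }

    std.debug.print("rendered {d} bricks\n", .{renderer.bricks_drawn});
}
```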

Honestly, that doesn’t appear nearly as impressive. Sure, it is probably still a useful tool, but it seems far off from being able to work on software without being supervised by an actual programmer.

6 Likes

Even if you give AI more “senses”, I don’t think that will fix the problem. I think it’s more of a knowledge-versus-wisdom dichotomy: AIs build up a lot of knowledge but don’t have the wisdom part. Wisdom largely comes from lived experience (so yes, “senses” may help a bit here) and from perceiving the results and how they interact with other things. Maybe perceiving will be achieved to some degree, who knows, but I don’t think empathy will. Empathy is not something that can be emulated, and therefore it will be missing from these systems.

2 Likes

No amount of progress on LLMs will produce general artificial intelligence. The architecture isn’t suitable for it. A good paper on the subject is GPT4 can’t reason; there are several other explorations of the problems here.

Chatbots are a useful adjunct to writing programs, no doubt about that. Depending on the task at hand and the language I’m working with, they can save a lot of time. As an example, I seldom bother thinking too hard about a regular expression anymore; I just drop some sample data and the goal into ChatGPT, and it usually works.

Because this sort of ability in software is unprecedented, and very recent, some people get excited, and don’t see the unbridgeable gap between synthesizing a function or two and programming as a professional endeavor. Unbridgeable by LLMs, I mean.

I don’t have any reason to think that it isn’t possible to make a true artificial intelligence. We might even live to see that, there are certainly many of the smartest people around working on figuring it out. My prediction is that if we do see that, we’ll see that LLMs were an interesting step on that journey, but ultimately, a dead end. It will take an architectural breakthrough, and those happen when they happen.

1 Like

I just watched the latest video from The Internet of Bugs, which fits very well into this discussion, albeit it is also conservative, or “ranty”, about LLMs:

I enjoy The Internet of Bugs channel very much, not only for AI topics but for general software engineering ones as well.

By senses I mean hearing, touch, vision; they basically want to train the models like humans. Their hypothesis is that by the age of 5 most kids have a good grasp of basic physics despite not being geniuses, whereas to even begin to grasp what will happen if you lift an apple and let it go, an AI has to ingest gigabytes if not terabytes of video just to begin to “understand” that gravity is the force that pulls objects toward the ground.

Whereas, after dropping something a few times, most kids have figured out that if you let go of an object it falls to the ground. This is a very simplistic summary, but the idea is simply that there is no reason to believe that feeding in more and more terabytes of data is the solution; instead, we should explore broader, more “life-like” training with a broader set of inputs. At least, that’s Yann LeCun’s hypothesis.