My bona fides: I've written my own Mathematica clone at least twice, maybe three times. Each time I get it parsing expressions and doing basic math, getting to basic calculus. Then I look up the sheer cliff face in front of me and think better of the whole thing.
There is an architectural flaw in Woxi that will sink it hard. Looking through the codebase, things like polynomials are implemented in the Rust code, not in woxilang. This will kill you long term.
The right approach is to have a tiny core interpreter, maybe moving to a JIT at some point if you can figure that out. Then implement all the functionality in woxilang itself. That means addition and subtraction, calculus, etc. are term rewriting rules written in woxilang, not Rust code.
This frees you up in the interpreter. Any improvements you make there will immediately show up over the entire language. Woxilang is also a better language to implement symbolic math in than Rust.
It also means contributors only need to know one language: woxilang.
No need to split between Rust and woxilang.
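A minimal sketch of the "tiny core, rules as data" idea described above, in Python for brevity rather than Rust or woxilang. The tuple encoding of expressions, the trailing-underscore convention for pattern variables, and the names `match`, `substitute`, `rewrite` and `RULES` are all hypothetical; this is not Woxi's actual design, just an illustration that simplification and differentiation can live in a rule table while the core stays a few dozen lines.

```python
# Expressions are ints, symbol strings, or tuples (head, arg1, arg2, ...).
# Pattern variables are strings ending in "_" (an assumed convention).

def match(pattern, expr, bindings):
    """Return extended bindings if pattern matches expr, else None."""
    if isinstance(pattern, str) and pattern.endswith("_"):
        if pattern in bindings:
            return bindings if bindings[pattern] == expr else None
        return {**bindings, pattern: expr}
    if isinstance(pattern, tuple):
        if not isinstance(expr, tuple) or len(pattern) != len(expr):
            return None
        for p, e in zip(pattern, expr):
            bindings = match(p, e, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == expr else None


def substitute(template, bindings):
    """Replace pattern variables in template with their bound values."""
    if isinstance(template, str) and template in bindings:
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template


def rewrite(expr, rules):
    """Rewrite sub-expressions first, then the whole, until no rule applies."""
    if isinstance(expr, tuple):
        expr = tuple(rewrite(e, rules) for e in expr)
    for lhs, rhs in rules:
        bindings = match(lhs, expr, {})
        if bindings is not None:
            return rewrite(substitute(rhs, bindings), rules)
    return expr


# The "math library" is plain data; the core above knows no arithmetic.
RULES = [
    (("Plus", 0, "x_"), "x_"),
    (("Plus", "x_", 0), "x_"),
    (("Times", 1, "x_"), "x_"),
    (("Times", "x_", 1), "x_"),
    (("D", "x_", "x_"), 1),
    (("D", ("Plus", "a_", "b_"), "x_"),
     ("Plus", ("D", "a_", "x_"), ("D", "b_", "x_"))),
    (("D", ("Times", "a_", "b_"), "x_"),
     ("Plus", ("Times", ("D", "a_", "x_"), "b_"),
              ("Times", "a_", ("D", "b_", "x_")))),
]

derivative = rewrite(("D", ("Times", "x", "x"), "x"), RULES)
print(derivative)
```

Adding a new identity means appending a tuple to `RULES`, not touching the core. The sketch has no conditions on patterns (so it cannot express "derivative of a number is 0") and will loop forever on a non-terminating rule set; a real system needs both handled.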
Hm, I thought about this a little and actually came to exactly the opposite conclusion: implement as much as possible in Rust to get the fastest code possible. Do you have any more insights into why this should not be possible or sustainable?
You have two distinct products: 1) an interpreter and 2) a math language.
Don't write your math in some funny imperative computer language.
Keep the interpreter's surface area as small as possible. Do some work to make sure you can accelerate numerics, and JIT/compile functions down to something as close to native as you can.
Wolfram and Taliesin Beynon have both said that Wolfram was working internally to get a JIT working in the interpreter loop. Keep the core small, and do that now while it's easy.
Also, it's just easier to write in Mathematica. It's probably 10x smaller than the Rust code:
f[x_Integer]:=13*x;
f::help:="Multiplies x by 13, in case you needed an easy function for that."
I love Rust for mathematical and scientific tasks (I am building the structural bio crate infrastructure), and I love Mathematica and have a personal sub. I should be the audience, but... What makes Mathematica great, IMO, is the polish and overall experience created by consistent work with applications in mind over decades. So, I look at this project with skepticism regarding its utility.
Sure, but you've got to start somewhere! And with the amount of progress I was able to make in just a few weeks, I'm very optimistic that the polish will come sooner rather than later.
Based on the list of contributors to your project, I am not sure this starting location is optimally suited to the task of building a foundation for polished, reliable, expandable software.
If I go by the contributor numbers on GitHub, I see Claude has committed something on the order of 300,000 lines of code. I don't think it's reasonable to review that much code, even in weeks' worth of time.
It's a defense mechanism. I was guilty as charged as well initially. Suddenly most of your l33t skillz are trivialized and surpassed by a non-human actor. It's a tough pill to swallow.
i'm curious if you intend to reimplement highly optimized numerical algorithms, symbolic algorithms, and so on, accumulated and tuned in mathematica since its 1988 release?
it's a huuuuuuuuge amount of technology in the standard library of mathematica, beyond the surface syntax and rewrite system, i mean.
I am not sure Octave ever had to put on that much polish. It just had to be decent enough to save $$$$ vs a Matlab license. If it can drop-in run the code that has been keeping the lab going for decades, good enough.
MathWorks offers a huge list of "toolboxes", domain specific extensions that cover a lot of features in each domain. Replacing Matlab isn't about the core language alone.
The notebooks were THE thing of Mathematica, at least to me.
12 years ago, as I was finishing my PhD in quantum optics, I wanted to migrate to the stack used in industry, and picked Python. That also made me an early adopter of Jupyter Notebook, as it captured what was needed and was open.
Now Mathematica notebooks (still remember, it is .nb) do not have the novelty factor. But they were the first to set a trend, which we now take for granted.
That said, I rarely use notebooks anymore. In the coding time, it is much easier to create scripts and ask to create a visualization in HTML.
Mathematica's notebooks are the only environment where I can do some computation to arrive at a symbolic expression. Copy the expression from the output cell into a new input cell. Then manipulate it by hand into the form I want. Then continue processing it further.
Also, symbolic expressions can be written nicely with actual superscripts and subscripts, and with non-latin characters.
I disagree, the language itself is one of the more elegant parts of the system, and enables a lot of the rest of the elegance.
From a pure programming language theory standpoint, it's pretty unique.
I once tried to find a language that had all the same properties, and I failed. The Factor language is probably the closest. But they are still pretty different.
The relevant programming paradigm is string/term rewriting, which is featured in other programming languages such as Pure. It seems to have few direct applications outside of symbolic computing itself, compilers and related fields such as PL theory. (Formal calculi and languages are often specified in PL theory as rewrite rules, even though the practical implementation may ultimately differ.)
First, I believe there is no such thing as the Mathematica language; it's the Wolfram Language, which is useful in a bunch of different applications. And second, if you don't have access to a $1000/yr Wolfram subscription, this would be the next best thing.
Hi, I'm the main developer. We're steadily getting closer to the next release which will support most features of Mathematica 1.0 plus some of the most popular newer functions (> 900 overall!). AMA!
There's a mystique around Mathematica's math engine. Is this groundless, or will you eventually run into problems getting correct, identical answers -- especially for answers that Mathematica derives symbolically? The capabilities and results of the computer algebra systems that I've used varied widely.
Hard to tell, honestly. So far there was always some surprisingly straightforward solution whenever I had problems with the math engine. There is actually a lot of public research on how equations can be solved/simplified with computer algorithms. So I'm optimistic.
I also stumbled upon a few cases where Mathematica itself didn't quite do things correctly (rounding errors, missing simplifications, etc.). So maybe it's actually a little overhyped …
It's a worthwhile effort. If successful, Woxi can enable a large mass of scientists and engineers who don't have access to Mathematica to run legacy code written for it. Also, Woxi would give those scientists and engineers who regularly use Mathematica a non-proprietary, less restrictive alternative, which many of them would welcome.
How does Woxi compare to other "clean-room implementations"[a] of the same language?
--
[a] Please check with a lawyer to make sure you won't run into legal or copyright issues.
Interesting, thanks for sharing. Naive question as I'm not familiar with Mathematica much (but aware of it and Wolfram Alpha and related tools), how does it compare to e.g. Jupyter or Julia or maybe another language (with its framework) that might be even closer?
I think Wolfram Language is just so much more ergonomic. No need to import dependencies (everything's included and consistent), very readable yet compact syntax, fewer gotchas than Python, R, etc., sensible defaults, …
Ymmv, but I've found that you sure do need to import things eventually, and it's not so ergonomic because most projects just end up as mega-notebooks.
Just like Python or any other language that looks easy for the learning examples, there are still hairy bits, they're just better hidden. The difference is that the debuggers for Python are far better.
Mathematica is great for quick stuff, but once you hit a particular level of complexity it goes crazy. In this regard I find it similar to Bash.
Yeah, I've already looked into it, but decided to keep developing it "example driven" for now. That is, I'm playing around with it, and whenever I find something that's broken I make a note of it, and then I pick those notes one by one and implement them. Once the most common things are implemented I will start writing property tests to catch all the edge cases of each feature.
I'm saying you can go even further and automate the entire thing using LLMs/agents, it is pretty much the ideal use case: you have a black-box reference implementation to test against; descriptive documentation for what the functions should do; some explicitly supplied examples in the documentation; and the ability to automatically create an arbitrary number of tests.
So not only do you have a closed loop system that has objective/automatic pass-fail criteria you also don't even have to supply the instructions about what the function is supposed to do or the test cases!
Obviously this isn't going to be 100% reliable (especially for edge cases) but you should be able to get an enormous speed up. And in many cases you should be able to supply the edge case tests and have the LLM fix it.
(Codex is still free for the next few days if you want to try their "High"/"Extra high" thinking models)
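To make the closed loop above concrete, here is a small sketch in Python. Everything in it is a stand-in: `reference_sqrt` plays the role of the black-box reference implementation (in practice this would presumably shell out to something like `wolframscript -code ...`), `candidate_sqrt` is the implementation under test, and the input list stands for examples taken from documentation. None of these names come from Woxi.

```python
import math


def reference_sqrt(x):
    """Stand-in for the black-box reference implementation (assumption)."""
    return math.sqrt(x)


def candidate_sqrt(x):
    """Implementation under test: plain Newton iteration."""
    if x == 0:
        return 0.0
    guess = x
    for _ in range(60):
        guess = 0.5 * (guess + x / guess)
    return guess


def run_closed_loop(candidate, reference, inputs, tol=1e-9):
    """Return (input, expected, actual) for every disagreement."""
    failures = []
    for x in inputs:
        expected = reference(x)
        actual = candidate(x)
        if not math.isclose(actual, expected, rel_tol=tol, abs_tol=tol):
            failures.append((x, expected, actual))
    return failures


documented_examples = [0, 0.25, 1, 2, 1e6, 1e-6]
failures = run_closed_loop(candidate_sqrt, reference_sqrt, documented_examples)
print("failures:", failures)
```

The returned failure list is the objective pass/fail signal: an agent (or a human) iterates on the candidate until the list is empty, and the expected values never have to be written by hand.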
This is cool! I've always wanted a polished kernel on the terminal. I spent a lot of time a few years ago writing my own Wolfram Kernel. It was a blast to understand how a pattern matching (symbolic) language is implemented.
Have you considered doing property tests with Mathematica as an oracle?
An AI-based development workflow with a concrete oracle works very well. You still need the research and planning to solve things in a scalable way, but it solves the "are the tests correct" issue.
What we've done is pull out failing property tests as unit tests, which makes regression testing during the agentic coding loop much more efficient.
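A rough sketch of that workflow in Python, under stated assumptions: `oracle_add` stands in for the reference system, `buggy_add` is a deliberately broken candidate, and the helper names are invented for illustration. Random inputs are checked against the oracle, and the smallest disagreements are pinned as fixed regression cases that are cheap to re-run on every iteration.

```python
import random


def oracle_add(a, b):
    """Stand-in for the reference system (assumption)."""
    return a + b


def buggy_add(a, b):
    """Candidate with a deliberate bug: wrong when b is negative."""
    return a + abs(b)


def find_counterexamples(candidate, oracle, trials=200, seed=0):
    """Property test: candidate must agree with the oracle on random inputs."""
    rng = random.Random(seed)
    found = []
    for _ in range(trials):
        a, b = rng.randint(-50, 50), rng.randint(-50, 50)
        expected = oracle(a, b)
        if candidate(a, b) != expected:
            found.append((a, b, expected))
    return found


def as_regression_cases(counterexamples, limit=3):
    """Pin the smallest distinct counterexamples as fixed (a, b, expected) cases."""
    ranked = sorted(set(counterexamples),
                    key=lambda c: (abs(c[0]) + abs(c[1]), c))
    return ranked[:limit]


def run_regressions(candidate, cases):
    """Fast unit-test style check that needs neither oracle nor randomness."""
    return all(candidate(a, b) == expected for a, b, expected in cases)


cases = as_regression_cases(find_counterexamples(buggy_add, oracle_add))
print("pinned cases:", cases)
print("buggy passes:", run_regressions(buggy_add, cases))
```

Because the pinned cases store the expected value, the oracle does not need to be called again while the candidate is being fixed; the slower randomized run can be reserved for a final check.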
I regularly use Mathematica for working with symbolic expressions (for its DSolve and transfer function stuff) and it is way more maintainable and elegant to have fractions, symbols and powers rendered in math mode instead of having to deal with a text only representation. Are there any front ends (either custom or somehow extending jupyter) for this project which recreate this experience?
For folks who are considering passing, note that there is a "Jupyter Lite" mode in addition to "Woxi Studio" --- seems very promising and the former addressed my first concern out-the-gate.
Such a massive undertaking would be almost impossible without AI agents, so yeah, they help me. But with around 5000 tests, they are actually helping to improve the software quality!
Reviewing the correctness of code is a lot harder than writing correct code, in my experience. Especially when the code given looks correct on an initial glance, and leads you into faulty assumptions you would not have made otherwise.
I'm not claiming AI-written and human-reviewed code is necessarily bad, just that the claim that reviewing code is equivalent to writing it yourself does not match my experience at all.
Plus, if you look at the commit cadence, there are a lot of commits 5-10 minutes apart in places that add new functionality (which I realize doesn't mean they were "written" in that time).
I find people argue a lot that "if it is reviewed, it is the same", which might be easy to uphold when you start, but I think the allure of just glancing at it, going "it makes sense", and hammering on is super high and hard to resist.
We are still early into the use of these tools so perhaps best practices will need to be adjusted with these tools in mind. At the moment it seems to be a bit of a crap shoot to me.
What's stopping some Mathematica employee from taking the source code and having an agent port it? Or even reconstructing it from the manual? Who owns an algorithm?
Any patent. The question was who owns an (arbitrary) algorithm. The elaborated answer is that nobody “owns” an algorithm (i.e. has intellectual property rights to it) without a patent: in the USA and many other jurisdictions, patents are the IP tool relating to algorithms.
SPSS is hilariously painful to use. Still it's only losing ground ever so slowly. PSPP remains almost unheard of among SPSS core users.
One of the best features of the Mathematica system.
https://writings.stephenwolfram.com/2013/02/what-should-we-c...
https://rulebasedintegration.org/
How close is it to being able to run Rubi (https://rulebasedintegration.org/)?
Here are, for example, all the values for the Plus[] function:
$ wolframscript -code 'WolframLanguageData["Plus", "Ranks"]'
{All -> 6, StackExchange -> 8, TypicalNotebookInputs -> 5, TypicalProductionCode -> 6, WolframAlphaCodebase -> 6, WolframDemonstrations -> 4, WolframDocumentation -> 4}
Better license? Allowed for commercial operations?
- Faster startup time because of no license check
- Can run multiple instances of Woxi at the same time
- Embeddable via WASM
- Configurable via compile time flags (which features should be included)
- …
https://github.com/anandijain/cas8.rs
What's stopping some Mathematica employee from taking the source code and having an agent port it? Or even reconstructing it from the manual? Who owns an algorithm?
Will everything get copied eventually?
Laws against theft. Also the same reason employees don't release the code on pastebin or something.
> Who owns an algorithm?
The org or person who was granted the software patent. https://en.wikipedia.org/wiki/Software_patent
> Will everything get copied eventually?
If we're lucky. More likely everything bitrots as technical capabilities are lost. Slowly at first, then quickly.