[spam][julia][wrong] Probabilistic Program Synthesis?
The first thing you think when reading about julialang is probably along the lines of, "oh i could totally make a hyperintelligent computer virus that took over the world, with that". Imagine crowds and crowds of engineers thinking this. Well, I websearched for "julia function synthesis", and lo and behold, it's in the standard library. Or maybe they just centralise the docs for the package repository, I'm not sure. https://docs.juliahub.com/Synthesizer/b8Maw/0.1.0/ I'll read this while typing it into this email, so that I might get farther. It sounds cool.

Note: It may be productive to focus on speeding the crafting of tests. Additionally, the function generator does not yet consolidate and optimize its result.

- This is the documentation for the Synthesizer.jl program synthesis system. [ https://github.com/femtomc/Synthesizer.jl ] The goal of this package is to provide a set of interfaces for probabilistic program synthesis via sketching. The hope is to facilitate research into probabilistic program synthesis, as well as provide a pipeline to optimize synthesized programs as IR fragments, before compilation into a method body in Julia.

The idea behind sketching is to provide a high-level skeleton of the program (or function) you want to synthesize. This high-level skeleton contains a number of "holes" which the synthesizer should fill with an expression or function. Let's say we wanted to synthesize a sorting program.

```julia
function sort!(x::Array{T}) where T <: Number
    while !check(x)
        x = hole(array_operation, x)
    end
    return x
end
```

Here, I've provided the high-level structure of the function (e.g. control flow), but I've also inserted a hole, and I'm telling the synthesizer what sort of hole it should be (an array_operation function, which acts on x). array_operation is defined by a miniature DSL.
```julia
@lang (array_operation) _1 expr = begin
    0.20 => swap!(_1)
    0.20 => reorder(_1)
    0.20 => swap_head!(_1)
    0.20 => swap_tail!(_1)
    0.20 => reverse(_1)
end
```

This DSL is probabilistic - the numbers on the left-hand side correspond to the probability of selection by the synthesizer. Furthermore, we've specified that holes of array_operation take in a single argument (here, by specifying _1 before defining the expr of type array_operation).

In our high-level program, we've cheated a bit by providing a way for the program to determine if it's sorted the list correctly. The function check checks if the array satisfies the sorted condition.

```julia
function check(x::Array{T}) where T <: Number
    length(x) == 1 && return true
    i = x[1]
    for j in x[2:end]
        if !(i < j)
            return false
        end
        i = j
    end
    return true
end
```

To summarize: in Synthesizer.jl, the user writes small probabilistic DSLs (specifically, probabilistic context-free grammars) which can then be used inside holes in other functions. The process of synthesis is expressed using a universal trace-based probabilistic programming system. [ https://github.com/femtomc/Jaynes.jl ] The core of the engine is a multi-threaded rejection sampler.

```julia
function synthesize(sel::Array{K}, fn::Function, pair::T; iters = 50) where {K <: Jaynes.ConstrainedSelection, T <: Tuple}
    in, out = pair
    success = Jaynes.CallSite[]
    Threads.@threads for s in sel
        for i in 1:iters
            ret, cl, w = generate(s, fn, in)
            out == ret && push!(success, cl)
        end
    end
    return success
end

function synthesize(fn::Function, pairs::Array{T}; iters = 50) where T <: Tuple
    local cls
    constraints = [selection()]
    for p in pairs
        cls = synthesize(constraints, fn, p; iters = iters)
        cls == nothing && return cls
        constraints = map(cls) do cl
            get_selection(cl)
        end
    end
    return cls
end
```

Here, synthesize requires that the user pass in a function with holes, as well as pairs of (input, output) tuples. generate generates a set of possible solutions.
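To check my understanding of the sketch-plus-rejection-sampling idea, here's a toy Python version. The array operations and their probabilities are my own stand-ins, not the DSL's actual rules, and the "trace" is just the list of rules sampled:

```python
import random

# Toy PCFG for the hole: each rule is (probability, function). These concrete
# operations are my inventions, loosely modeled on the array_operation DSL.
def swap_head(xs):
    return [xs[1], xs[0]] + xs[2:] if len(xs) > 1 else xs

def rev(xs):
    return xs[::-1]

def rot(xs):
    return xs[1:] + xs[:1]

RULES = [(0.4, swap_head), (0.3, rev), (0.3, rot)]

def sample_rule():
    # Sample a rule according to its probability (inverse-CDF over the list).
    r, acc = random.random(), 0.0
    for p, f in RULES:
        acc += p
        if r <= acc:
            return f
    return RULES[-1][1]

def run_sketch(xs, max_steps=50):
    """The sort! sketch: fill the hole with a sampled rule until sorted,
    recording the trace of sampled rule names."""
    trace = []
    while xs != sorted(xs):
        if len(trace) >= max_steps:
            return None, trace
        f = sample_rule()
        trace.append(f.__name__)
        xs = f(xs)
    return xs, trace

def rejection_synthesize(inp, out, iters=500):
    """Rejection sampling: keep only traces whose final output matches out."""
    random.seed(0)  # fixed seed for reproducibility
    successes = []
    for _ in range(iters):
        ret, trace = run_sketch(list(inp))
        if ret == out:
            successes.append(trace)
    return successes

traces = rejection_synthesize([3, 1, 2], [1, 2, 3])
```

Each successful trace here plays the role of a CallSite above: a recording of the choices the PCFG made on a run that satisfied the input/output pair.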
This version of synthesize will return an Array of successful traces (if the search succeeds). The CallSite representation is a trace of the original function, as well as a recording of the choices made by the PCFG. In future versions, Synthesizer.jl will support the ability to compile these traces into IR and generate a new (optimized) method body for the synthesized function.
Here's an earlier one from 2019. It likely uses a different approach. https://github.com/lcary/ProgramSynthesis.jl

ProgramSynthesis.jl
===================

`ProgramSynthesis.jl` is a library of program synthesis routines and tools. This project aims to introduce a library written in Julia that can be imported and used to automatically generate programs via enumeration, and filter those programs based on their ability to solve given problem sets (or sets of input-output examples).

What is Program Synthesis?
--------------------------

The term "program synthesis" refers to the automatic construction or synthesizing of programs that match a given specification.

Usage
-----

Currently, the main use case for `ProgramSynthesis.jl` is to receive a "request" consisting of an array of problems, each having its own set of input-output examples, and produce a "response" dictionary of solutions, where each key is the name of the problem, and each value is an array of programs that solve the problem. The data format of the request and response is JSON.

Example:

```
❯ julia --project bin/main.jl enumerate test/resources/request_enumeration_example_6.json
/Users/lcary/w/mit/ProgramSynthesis.jl/messages/response_enumeration_PID4341_20190805_T104049.json
```

Testing
-------

To run the unit tests, clone the repo and run:

```
./runtests
```

References
----------

This repository implements enumeration algorithms from the following "ec"/"dreamcoder" codebase: https://github.com/ellisk42/ec

Program synthesis wiki: https://en.wikipedia.org/wiki/Program_synthesis

The representation of solutions relies on Lambda Calculus: https://en.wikipedia.org/wiki/Lambda_calculus

The construction of programs in enumeration uses unification: https://en.wikipedia.org/wiki/Unification_(computer_science)

Request Formats
---------------

Problem request format:

```
{
  "DSL": {
    "logVariable": 0.0,
    "productions": [
      { "expression": "map", "logProbability": 0.0 },
      ...
    ]
  },
  "tasks": [
    {
      "examples": [
        { "inputs": [[6,1,1,6,3]], "output": 5 },
        ...
      ],
      "name": "length-test",
      "request": {
        "constructor": "->",
        "arguments": [
          { "constructor": "list", "arguments": [ { "constructor": "int", "arguments": [] } ] },
          { "constructor": "int", "arguments": [] }
        ]
      },
      "maximumFrontier": 10
    }
  ],
  "programTimeout": 0.0005,
  "nc": 1,
  "timeout": 1.0,
  "lowerBound": 4.5,
  "upperBound": 6.0,
  "budgetIncrement": 1.5,
  "verbose": false,
  "shatter": 10
}
```

Solution response format:

```
{
  "length-test": [
    {
      "program": "(lambda (length $0))",
      "time": 0.9078459739685059,
      "logLikelihood": 0.0,
      "logPrior": -4.795790545596741
    }
  ]
}
```
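To make sure I follow the response format: it maps each problem name to an array of candidate programs. A small helper (my own, not part of ProgramSynthesis.jl; the "sum-test" entry is a hypothetical unsolved problem) to pull out which problems got at least one solution:

```python
import json

def solved_names(response_json):
    # The response is a dict: problem name -> array of candidate programs.
    # Keep only names whose program array is non-empty.
    response = json.loads(response_json)
    return sorted(name for name, programs in response.items() if programs)

response = json.dumps({
    "length-test": [{"program": "(lambda (length $0))",
                     "time": 0.907,
                     "logLikelihood": 0.0,
                     "logPrior": -4.796}],
    "sum-test": [],  # hypothetical problem with no solutions found
})
print(solved_names(response))  # -> ['length-test']
```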
Next I found a paper on solving the Schrödinger equation using program synthesis. https://aip.scitation.org/doi/pdf/10.1063/5.0062497 Here's its discussion on program synthesis:

Sec. I. Introduction

... Against the backdrop of AI/ML, a fundamentally different approach to computational science is _program_synthesis_ (PS).[38-45] Here, the central idea is to use enumeration, AI, or other optimization strategies to generate complete computer programs that achieve a user-defined task subject to performance criteria; in other words, the focus in PS is on the generation of a computer code, rather than the more common tasks of regression or classification encountered in applying AI/ML in computational chemical sciences.

The purpose of this article is to explore the application of PS tools to derive novel algorithms to solve a benchmark problem in quantum chemistry, namely, determination of solutions to the time-independent vibrational Schrödinger equation. In the context of quantum chemistry, it is worth highlighting that previous automatic formula derivation methods and PS strategies have found success in the domain of _ab_initio_ electronic structure theory.[46-48] For example, Hirata's Tensor Contraction Engine (TCE)[46,47] takes as input an electronic wavefunction ansatz, such as that given in coupled-cluster schemes, and subsequently generates appropriate tensor contractions and associated instruction sets to enable highly efficient computational implementation of the underlying theory. Here, working in the second quantization formalism, TCE operates by applying well-defined contraction rules to generate efficient formulas for evaluating relevant operator matrix elements and implements the resulting code for these contractions.
This scheme provides an impressive demonstration of the power of PS in tackling numerically daunting tasks but differs from the generalized inductive optimization strategy employed in this article, where the focus is not on deployment of rigorous contraction rules to optimize equation implementation but is, instead, aimed at exploring novel algorithms that can solve target problems in previously unknown ways using previously unknown methods.

In this article, we consider the application of inductive PS to the problem of generating the code and algorithms that can solve the time-independent vibrational Schrödinger equation. Within this inductive programming paradigm, we assume that we have a set of examples of expected outputs required from our code when executed on a corresponding set of inputs; these correct input/output examples are assumed to be provided by an _oracle_ code that can exactly produce the desired output for any input. Within this inductive paradigm, the problem of PS can be cast as a challenge in optimization; one seeks the set of computer instructions that form a code to produce the expected output given each example input. In this sense, this inductive PS can be viewed as somewhat comparable to the usual training of an artificial neural network (ANN), where one optimizes network weights using a set of input/output examples; the key difference in PS is that the output is not a series of ANN weights but is, instead, a complete computer program to perform the required processing. As such, by focusing on algorithm generation, this approach has the potential to automatically discover novel theories or functional patterns that might not have been otherwise anticipated; investigating whether this strategy can be ported into the realm of computational chemistry is an important goal of this article.
Here, we are interested not in function approximation or fast implementation of defined tensor contractions but in generating complete _codes_ that suggest general routes to solving the time-independent Schrödinger equation (TISE). With the challenge of PS framed as an optimization problem, a range of different approaches are commonly applied to auto-generate a code that satisfies the target program performance. In this context, genetic programming (GP) has been used extensively.[43-45,49,50] Here, a code is typically represented as a tree structure constructed from a set of pre-defined primitive functions (such as *, /, +, -) and constants; arranging this set of functions and constants in a variable tree structure acting on a set of inputs creates a computer program that gives a specific output for each input that is presented. In GP applied to PS, a population of tree codes is evolved under the action of evolutionary operations; the fitness of each population member is given by an appropriate metric, indicating how accurately the tree code yields the expected target outputs, and the principles of Darwinian evolution are used to drive optimization of the population toward better-performing codes.

As we describe below, we use a simpler simulated annealing (SA) protocol to perform PS here, in combination with a code representation that is comparable to that employed in Cartesian GP (CGP).[44,51] Our interest here is in exploring challenges and possibilities for PS in the domain of quantum chemistry, and we leave the further development of targeted optimization strategies and code representations to future work.

In this article, we investigate the use of a linear PS system to generate solutions of the TISE. In this initial proof-of-concept study, we primarily investigate the extent to which PS enables solution of the one-dimensional (1D) vibrational Schrödinger equation for bound polynomial PESs.
While simple, such systems serve as a useful starting point for which numerically exact solutions can be readily generated to enable the evaluation of PS-generated algorithms. In contrast to schemes used in electronic structure theory (such as TCE[46,47]) or previous GP-based schemes,[50] our strategy does not optimize implementation of a specific target wavefunction ansatz but, instead, uses a search in a space defined by possible instruction sequences, guided by an optimization function assessing code performance.

The remainder of this article is organized as follows. First, in Sec. II, we discuss our own implementation of a linear PS system, combined with a SA optimization protocol. In Sec. III, we then show how our PS strategy can readily generate novel algorithms (codes) that solve the Schrödinger equation for arbitrary bound polynomial PESs; for example, by optimizing code structures for a set of just ten input/output examples, we find that the resulting codes give accurate ground-state wavefunctions for a further 10^3 random PES examples. We also investigate some aspects of our own PS setup in an attempt to better understand the impact of algorithm parameters on predictive performance. In Sec. V, we discuss current limitations of our PS approach and offer some further discussion as to how the PS strategy might be extended to enable the prediction of properties for far more complex, many-dimensional systems. Finally, Sec. VI summarizes this contribution.
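The optimization loop they describe (a linear instruction list, point mutations, SA acceptance) is simple enough to caricature in plain Python. Everything in this sketch is my own invention - the instruction set, the target function f(x) = 2x + 1, and the cooling schedule - none of it is the paper's actual code:

```python
import math
import random

# A "program" is a fixed-length list of instructions acting on one register r,
# scored against input/output examples, optimized by simulated annealing.
OPS = ["add_x", "mul_x", "inc", "double", "nop"]

def run(code, x):
    r = 0.0
    for op in code:
        if op == "add_x":
            r += x
        elif op == "mul_x":
            r *= x
        elif op == "inc":
            r += 1
        elif op == "double":
            r *= 2
        # "nop" leaves r unchanged
    return r

def loss(code, examples):
    # Sum of squared errors over the input/output examples.
    return sum((run(code, x) - y) ** 2 for x, y in examples)

def anneal(examples, length=4, steps=5000, t0=50.0):
    code = [random.choice(OPS) for _ in range(length)]
    best, best_loss = code[:], loss(code, examples)
    for i in range(steps):
        t = t0 * (1 - i / steps) + 1e-9              # linear cooling
        cand = code[:]
        cand[random.randrange(length)] = random.choice(OPS)  # point mutation
        dl = loss(cand, examples) - loss(code, examples)
        # Accept improvements always, worsenings with Boltzmann probability.
        if dl <= 0 or random.random() < math.exp(-dl / t):
            code = cand
        if loss(code, examples) < best_loss:
            best, best_loss = code[:], loss(code, examples)
    return best, best_loss

# Oracle: ten input/output examples of f(x) = 2x + 1, as in the paper's
# ten-example setup (the target function itself is my stand-in).
random.seed(0)
examples = [(x, 2 * x + 1) for x in range(10)]
code, final_loss = anneal(examples)
```

A zero-loss program exists in this space (e.g. add_x, inc, add_x, nop), so the search has something exact to find; in the paper's setting the examples come from a numerically exact oracle instead.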
II. THEORY

Here, we first describe our target application, namely, determination of the ground-state wavefunctions for one-dimensional Schrödinger equations with bound PESs. We then explain how algorithms are represented in our linear PS approach, before describing how the performance of the computer-generated code is optimized using a SA protocol. We also give details of the input, output, and internal function sets that are used in our implementation to form solutions to Eq. (1).

A. Problem representation

As a benchmark problem, we will seek to identify codes that can produce the ground-state solution ψ(x) for the 1D time-independent Schrödinger equation,

    −(ℏ²/2m) d²ψ(x)/dx² + V(x) ψ(x) = E₀ ψ(x).    (1)

Here, m is the particle mass and E₀ is the total energy of the eigenstate ψ(x); in what follows, we adopt atomic units such that ℏ = 1, and we will only consider systems for which m = 1. As emphasized below and in contrast to previous research,[50] we are aiming to generate entire computer programs that can solve the eigenproblem of Eq. (1) to give ψ(x) for more general V(x). In all the following discussion, we will aim to generate algorithms for PESs V(x) that are of a general polynomial form given by

    V(x) = Σ_{k=1}^{Np} a_k x^k,    (2)

where Np is the polynomial order and a is a set of random coefficients chosen in the range [−5, +5]. To encourage generation of algorithms that are transferable across different polynomial orders, the order Np is randomly selected Np ∈ [2, 6] whenever a PES is required as input during our PS strategy. Furthermore, we require that any V(x) is bound such that it is greater than a sufficiently large value [chosen to be V(x) = 5 here] at the boundaries of the coordinate range-of-interest (see below).
This setup for V(x) is chosen because, by choosing different random coefficients a, it allows us to generate arbitrary numbers of different V(x) examples that can be used as target inputs to verify the accuracy of codes generated by PS. Based on the above definitions, our goal is to now automatically synthesize algorithms that output ψ(x) for a given V(x). However, to make progress, we must first define how the mathematical problem captured by Eq. (1) will be represented in our computer-generated codes; here, we adopt a computationally simple and practical approach, representing both V(x) and ψ(x) as real-valued numbers on a uniform grid x. The vectors denoting these properties on the grid-points will be labeled V and ψ, respectively; in all calculations reported below, the uniform grid comprises ng = 101 grid-points uniformly distributed on the range xᵢ ∈ [−5, +5]. This grid-based approach is well-known in the field of quantum simulations, as in the discrete variable representation (DVR[52-55]) that forms the basis of common strategies for solving Eq. (1); the connection between DVR methods and the algorithms generated here is discussed further below. Alternative representations of the input V(x) and output ψ(x) can clearly be explored in the future, whereas the focus of the current article is in providing an initial proof-of-concept.

B. Program representation

Previous work on PS has employed a variety of different structures for representing computer programs, such as tree structures, directed acyclic graphs, and fixed-size grids.[43-45,49,50] In this article, we choose to use a program representation that is analogous to that used in CGP, namely, using a grid of functional nodes that operate on input vectors, constants, and matrices.[44,56,57] Although we anticipate that a tree structure, as commonly employed in GP, can be used to solve the same problem, a grid-like code representation is selected here for its simplicity, flexibility, and ease of interpretation.
Of course, there is no guarantee that the chosen program representation used here is optimal in any way, and the impact of the program structure will be a subject of interest in future work.

........................

this doesn't look like synthesis of a computer program to me, more like synthesis of an equation. i may be wrong. it looks like it turned up as a result for a websearch on 'julia' because it was published in 'july'. it doesn't mention julia.
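Still, the problem setup in their Sec. II A is concrete enough to sketch the input side in a few lines. Here's my paraphrase of their PES generator (random bound polynomial on a 101-point grid over [−5, +5]) - the paper gives no code and isn't in Python, so this is purely illustrative:

```python
import random

NG = 101                                             # ng = 101 grid-points
GRID = [-5 + 10 * i / (NG - 1) for i in range(NG)]   # x_i uniform on [-5, +5]

def random_pes(rng):
    """Draw V(x) = sum_{k=1}^{Np} a_k x^k with Np in [2, 6] and a_k in
    [-5, 5], rejecting draws that are not bound (V > 5 at both edges)."""
    while True:
        order = rng.randint(2, 6)
        a = [rng.uniform(-5, 5) for _ in range(order)]
        # k runs 0..order-1 in enumerate, so the power is k + 1 (degrees 1..Np).
        V = [sum(ak * x ** (k + 1) for k, ak in enumerate(a)) for x in GRID]
        if V[0] > 5 and V[-1] > 5:
            return V

V = random_pes(random.Random(0))
```

With a generator like this, arbitrarily many (V, ψ) training pairs can be produced by feeding each accepted V to an exact oracle solver.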
well i dunno. to me, procedural programming is still distinct from taking derivatives of equations. maybe those two julia libraries are more useful for designing source code.
MIT's latest probabilistic programming library is called Gen, built on julia, and is ongoing research involving the author of the first library above. https://www.gen.dev/

```
$ julia
julia> ]
pkg> add Gen
```
okay, "probabilistic programming" is totally different from "program synthesis". ha! Hey, this thing looks cool. It comes with a multimedia presentation video too. https://dl.acm.org/doi/10.1145/3485538

APIfix: output-oriented program synthesis for combating breaking changes in libraries

Abstract

Use of third-party libraries is extremely common in application software. The libraries evolve to accommodate new features or mitigate security vulnerabilities, thereby breaking the Application Programming Interface (API) used by the software. Such breaking changes in the libraries may discourage client code from using the new library versions, thereby keeping the application vulnerable and not up-to-date. We propose a novel output-oriented program synthesis algorithm to automate API usage adaptations via program transformation. Our aim is not only to rely on the few example human adaptations of the clients from the old library version to the new library version, since this can lead to over-fitting transformation rules. Instead, we also rely on example usages of the new updated library in clients, which provide valuable context for synthesizing and applying the transformation rules. Our tool APIFix provides an automated mechanism to transform application code using the old library versions to code using the new library versions, thereby achieving automated API usage adaptation to fix the effect of breaking changes. Our evaluation shows that the transformation rules inferred by APIFix achieve 98.7% precision and 91.5% recall. By comparing our approach to state-of-the-art program synthesis approaches, we show that our approach significantly reduces over-fitting while synthesizing transformation rules for API usage adaptations.
aw man all the cutting-edge program synthesis research stuff i get from google is for windows
https://www.cs.cornell.edu/~asampson/blog/minisynth.html Program Synthesis is Possible MAY 9, 2018 Program synthesis is not only a hip session title at programming languages conferences. It’s also a broadly applicable technique that people from many walks of computer-science life can use. But it can seem like magic: automatically generating programs from specifications sounds like it might require a PhD in formal methods. Wisconsin’s Aws Albarghouthi wrote a wonderful primer on synthesis last year that helps demystify the basic techniques with example code. Here, we’ll expand on Aws’s primer and build a tiny but complete-ish synthesis engine from scratch. You can follow along with my Python code or start from an empty buffer. Z3 is Amazing We won’t quite start from scratch—we’ll start with Z3 and its Python bindings. Z3 is a satisfiability modulo theories (SMT) solver, which is like a SAT solver with theories that let you express constraints involving integers, bit vectors, floating point numbers, and what have you. We’ll use Z3’s Python bindings. Let’s try it out:
```
import z3
```
To use Z3, we’ll write a logical formula over some variables and then solve them to get a model, which is a valuation of the variables that makes the formula true. Here’s one formula, for example:
```
formula = (z3.Int('x') / 7 == 6)
```
The z3.Int call introduces a Z3 variable. Running this line of Python doesn't actually do any division or equality checking; instead, the Z3 library overloads Python's / and == operators on its variables to produce a proposition. So formula here is a logical proposition of one free integer variable, x, that says that x / 7 = 6. Let's solve formula. We'll use a little function called solve to invoke Z3:

```
def solve(phi):
    s = z3.Solver()
    s.add(phi)
    s.check()
    return s.model()
```

Z3's solver interface is much more powerful than what we're doing here, but this is all we'll need to get the model for a single problem:
```
print(solve(formula))
```
On my machine, I get:

```
[x = 43]
```

which is admittedly a little disappointing, but at least it's true: using integer division, 43 / 7 = 6. Z3 also has a theory of bit vectors, as opposed to unbounded integers, which supports shifting and whatnot:
```
y = z3.BitVec('y', 8)
print(solve(y << 3 == 40))
```
There are even logical quantifiers:
```
z = z3.Int('z')
n = z3.Int('n')
print(solve(z3.ForAll([z], z * n == z)))
```
Truly, Z3 is amazing. But we haven't quite synthesized a program yet.

Sketching

In the Sketch spirit, we'll start by synthesizing holes to make programs equivalent. Here's the scenario: you have a slow version of a program you're happy with; that's your specification. You can sort of imagine how to write a faster version, but a few of the hard parts elude you. The synthesis engine's job will be to fill in those details so that the two programs are equivalent on every input. Take Aws's little example: you have the "slow" expression x * 2, and you know that there's a "faster" version to be had that can be written x << ?? for some value of ??. Let's ask Z3 what to write there:
```
x = z3.BitVec('x', 8)
slow_expr = x * 2
h = z3.BitVec('h', 8)  # The hole, a.k.a. ??
fast_expr = x << h
goal = z3.ForAll([x], slow_expr == fast_expr)
print(solve(goal))
```
Nice! We get the model [h = 1], which tells us that the two programs produce the same result for every byte x when we left-shift by 1. That’s (a very simple case of) synthesis: we’ve generated a (subexpression of a) program that meets our specification. Without a proper programming language, however, it doesn’t feel much like generating programs. We’ll fix that next. ............. it's discouraging that the examples so far do not generate code other than variables. still seeming like the first julia library is more accessible. above python library is at https://github.com/sampsyo/minisynth
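Out of curiosity, for this tiny example you can skip the solver entirely and brute-force the hole: check which shift amounts h make x * 2 equal x << h for every 8-bit x, masking to 8 bits since bit-vector arithmetic wraps. This is my own sanity check, not from the post:

```python
def find_holes():
    # Try every shift amount 0..7 and keep those matching x * 2 on all bytes.
    return [h for h in range(8)
            if all((x * 2) & 0xFF == (x << h) & 0xFF for x in range(256))]

print(find_holes())  # -> [1]
```

Of course this only works because the hole ranges over 8 values and the spec over 256 inputs; the point of Z3 is that it scales where enumeration doesn't.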
https://github.com/femtomc/Synthesizer.jl is only 134 lines of source long, and its author is no longer maintaining it. For me to continue here, it could be helpful to learn how to use a more active program synthesis package, to do similar things as the sketches in this repository. I'm guessing that the concern is that if I put the energy in to add compilation to it, the work may simply be lost and unused both by me and everyone else. But maybe I can try anyway!
This might be one for python, which I am actually familiar with, unlike julia. It doesn't quite look like it is designed for ease of prototyping, unsure. https://github.com/randriu/synthesis

# PAYNT

PAYNT (Probabilistic progrAm sYNThesizer) is a tool for the automated synthesis of probabilistic programs. PAYNT takes a program with holes (a so-called sketch) and a PCTL specification (see below for more information), and outputs a concrete hole assignment that yields a satisfying program, if such an assignment exists. Internally, PAYNT interprets the incomplete probabilistic program as a family of Markov chains and uses state-of-the-art synthesis methods on top of the model checker [Storm](https://github.com/moves-rwth/storm) to identify satisfying realizations. PAYNT is implemented in python and uses [Stormpy](https://github.com/moves-rwth/stormpy), python bindings for Storm. This repository contains the source code of PAYNT along with adaptations for Storm and Stormpy, prerequisites for PAYNT. PAYNT is hosted on [github](https://github.com/gargantophob/synthesis).
thinking on how to quickly expand a sketch-based synthesis system (_that can call itself recursively_) to make an agi with human supervision: basically you want to select the part of the problem that is most helpful for the user to help with, and present information to them in a helpful way - ideally while simultaneously trying things on one's own. alternatively, give them an option to say the information is not helpful and pursue things further, selecting user presentations that produce reduced delay - ones the user won't likely reject, or will say they could try if really needed. [further thoughts misplaced]
participants (1)
-
Undiscussed Horrific Abuse, One Victim of Many