Discussion:
[Fonc] Talk Monday 26th, San Francisco
Anthony Di Franco
2015-10-25 01:25:05 UTC
Hi all,
Sorry about the late notice, but it very recently occurred to me that this
would be a good group to invite feedback from on a talk I'm giving this
coming Monday on approximate search techniques for problems in programming
languages. It's my attempt to describe for the programming language
community the broader agenda from this talk
<http://www.meetup.com/Bay-Area-entrepreneur-in-statistics/events/224830306/>
(http://youtu.be/ZKlOrkfEPn8) I gave last month, which
covered, I suppose I could say, the intersection of nondeterminism and
computation in a very broad interdisciplinary way. To help me with this
shift in perspective, I could use some feedback on whether this makes sense
and how I might improve the focus, either before the talk or after. What
I'm trying to move towards is, in one sense, a generalization of constraint
solving to arbitrary recursive relationships potentially involving
uncertainty, so there's a lot of overlap in the applications with VPRI-like
work and goals, such as declarative UI, better, easier parsing, and Hesam
Samimi's PlanB, for example.

Hope to see some of you there or otherwise hear your thoughts. Here's the
info:
http://www.meetup.com/SF-Types-Theorems-and-Programming-Languages/events/226067416/

*Finally Fifth? Searching for answers in an uncertain world.*
Monday, October 26, 2015, 7:00 PM
Mixrank
<http://maps.google.com/maps?f=q&hl=en&q=164+Townsend+St.+%234%2C+San+Francisco%2C+CA%2C+94107%2C+us>,
164 Townsend St. #4, San Francisco, CA

*Most of us are probably familiar with the trajectory software projects
take: quick early progress with few people working on them, which
transitions, as the scope grows, continuously but sharply to a regime where
large numbers of people and large amounts of effort, at the scale of some
of the largest corporations in history, are insufficient even to keep up
with already known problems. Typically, motivated by the prevailing formal
logic background in programming language theory, people turn to software
methodologies with stronger a priori guarantees to mitigate this problem,
such as functional programming with types, but I will propose a different,
though not mutually exclusive, approach, drawing on a control theory and
systems theory background. Motivated by Robert Kowalski's perspective
developed in his "Algorithm = Logic + Control" (1979), I claim that the real
problem is the combinatorial explosion in the number of algorithms required
to enforce a desired set of relationships as that set of relationships
grows in size. The solution is to finally come to grips with
nondeterminism, and the solution to that, in turn, is to use approximate
search techniques that can take advantage of uncertain information,
information feedback, and compression of the search space. This motivates
the design of the "Fifth" software system I'm currently working on. We'll
conclude with a description of work in progress on the Fifth system.*
André van Delft
2015-10-25 23:37:45 UTC
Hi Anthony (and others),

Since you mention "declarative UI, better, easier parsing", I would recommend taking a look at the SubScript project, by Anatoliy Kmetyuk and me. SubScript is a Scala extension based on the Algebra of Communicating Processes (ACP).
ACP is basically an algebraic theory, useful for describing processes in terms of sequences (as multiplications) and choices (as additions).
Other compositions, such as parallelism and disruption, may be defined in terms of these.

ACP turns out to be a good basis for extending a deterministic programming language with nondeterministic choice, parallelism and dataflow.
I recommend this GUI controller example in SubScript: http://subscript-lang.org/a-simple-gui-application/
There are more examples and papers on that web site.
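
To make the algebraic reading concrete, here is a minimal Python sketch. It is not SubScript syntax and is not taken from the SubScript site. It models a finite process as the set of action traces it can perform, so that choice is set union and sequence is pairwise concatenation; a merge without communication can then be defined through interleaving. The names atom, choice, seq and par are made up for illustration. Trace sets are also a coarser model than the bisimulation semantics ACP is normally given: they satisfy x(y + z) = xy + xz, which is not an ACP axiom.

```python
from itertools import product


def atom(name):
    """A process that performs one action and then terminates."""
    return frozenset({(name,)})


def choice(p, q):
    """'Addition': behave as p or as q (union of the trace sets)."""
    return p | q


def seq(p, q):
    """'Multiplication': run p, then q (concatenate every pair of traces)."""
    return frozenset(s + t for s, t in product(p, q))


def interleave(s, t):
    """All interleavings of two traces, preserving the order within each."""
    if not s:
        return {t}
    if not t:
        return {s}
    return ({(s[0],) + rest for rest in interleave(s[1:], t)} |
            {(t[0],) + rest for rest in interleave(s, t[1:])})


def par(p, q):
    """Merge without communication, defined via interleaving of traces."""
    return frozenset(r for s, t in product(p, q) for r in interleave(s, t))


a, b, c = atom("a"), atom("b"), atom("c")

# Choice is idempotent and commutative.
assert choice(a, a) == a
assert choice(a, b) == choice(b, a)

# Sequence distributes over choice from the right: (a + b)c = ac + bc.
assert seq(choice(a, b), c) == choice(seq(a, c), seq(b, c))

# Parallelism of two atomic actions expands into a choice of sequences.
assert par(a, b) == choice(seq(a, b), seq(b, a))
```

The last assertion shows the sense in which parallelism is a derived composition: for two atomic actions it reduces to a choice between the two possible orderings.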

Ciao,

André van Delft
_______________________________________________
Fonc mailing list
http://mailman.vpri.org/mailman/listinfo/fonc_mailman.vpri.org
Anthony Di Franco
2015-10-26 21:21:19 UTC
Thanks André. Just had a quick look to see if there was anything I should
sneak into the talk. I will have to take another, longer look afterwards. Nick
Chen made a suggestion, perhaps similar in spirit, in the Meetup comments here
<http://www.meetup.com/SF-Types-Theorems-and-Programming-Languages/events/226067416/>.
I think the response might be similar: I am focusing on how to put
evaluation under control of a query planner, and solving the technical
challenges that arise in that. For my purposes, a language that makes it
easy to express nondeterministic choice could be useful for describing the
options available to the planner, and/or for writing the planner itself, but
should not otherwise be something the user would use to solve problems
directly. A complication in drawing from existing ways of describing
nondeterministic choice is that I am focusing on adapting the choices to
partial information, especially that gained from the search in progress.
Everything I have seen in the programming languages community for
nondeterminism uses either a boolean description of the possible validity
of a choice, or more recently permits making a probabilistic choice.
Neither is sufficient for the kind of search I am trying to do, where
there is partial information about many choices which do not necessarily
constrain one another's validity as in a probability distribution.
Brian Rice
2015-10-26 21:32:28 UTC
As an analytics/visualization/database professional with a long interest in
PL/systems research, I am looking forward to this tonight, thank you.
Anthony Di Franco
2015-10-27 00:36:19 UTC
Great, looking forward to meeting you.
Anthony
John Carlson
2015-11-05 02:02:21 UTC
I am familiar with Grammex, but I was wondering if there has been any more effort to create syntax by incremental example, given that validation tools are present (online, either getting feedback from a compiler, or ideally from an IDE with incremental compiling and code completion). From reading about reverse engineering protocols, I think this might be polynomial time, whereas without a validator (offline) it’s NP-complete? Or is that only for regular grammars? What about “syntax-free” languages, like machine code? Is this where the idea that it’s NP-complete came from? I’m actually interested in creating JSON schemas from thousands of example JSON files. I want something more detailed than the JSON specification, with domain validation.
John
David Barbour
2015-11-05 03:36:11 UTC
You could probably repurpose a recurrent neural network to compute a likely
schema in polynomial time.
John Carlson
2015-11-05 04:22:18 UTC
Sorry, I said that wrong. I am trying to generate a single JSON schema from thousands of JSON files. I don’t want a different JSON schema per JSON document. I think that might make a difference as to what technology to use.

Thanks,

John
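
For comparison with the neural-network suggestion, a deterministic baseline is to walk every document once and merge the types observed at each position into one schema. The Python sketch below is only an illustration under stated assumptions: the function names (json_type, new_node, observe, to_schema, infer_schema) are hypothetical, the output uses only the JSON Schema keywords type, properties, required and items, and it infers structure only. The domain validation asked for above (patterns, ranges, enumerations) would need a further pass over the collected values.

```python
import json


def json_type(value):
    """Name of the JSON Schema primitive type of a parsed JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def new_node():
    """Statistics for one position in the documents (hypothetical layout)."""
    return {"types": set(), "count": 0, "objects": 0, "props": {}, "items": None}


def observe(node, value):
    """Fold one value into the statistics, recursing into nested values."""
    node["types"].add(json_type(value))
    node["count"] += 1
    if isinstance(value, dict):
        node["objects"] += 1
        for key, child in value.items():
            observe(node["props"].setdefault(key, new_node()), child)
    elif isinstance(value, list):
        for child in value:
            if node["items"] is None:
                node["items"] = new_node()
            observe(node["items"], child)


def to_schema(node):
    """Turn the merged statistics into a JSON Schema fragment."""
    types = set(node["types"])
    if "number" in types:
        types.discard("integer")  # "number" already admits integers
    types = sorted(types)
    schema = {"type": types[0] if len(types) == 1 else types}
    if node["props"]:
        schema["properties"] = {
            key: to_schema(child) for key, child in sorted(node["props"].items())
        }
        # A key is required if every object seen at this position had it.
        required = sorted(
            key for key, child in node["props"].items()
            if child["count"] == node["objects"]
        )
        if required:
            schema["required"] = required
    if node["items"] is not None:
        schema["items"] = to_schema(node["items"])
    return schema


def infer_schema(documents):
    """Single schema describing all of the given parsed JSON documents."""
    root = new_node()
    for document in documents:
        observe(root, document)
    return to_schema(root)


examples = ['{"id": 1, "tags": ["a"]}', '{"id": 2.5, "name": null}']
schema = infer_schema(json.loads(text) for text in examples)
print(json.dumps(schema, indent=2))
```

Because observe only accumulates counts and type sets, the result does not depend on the order of the files, and nested, hierarchical documents are handled by the recursion.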
Matthew Retchin
2015-11-05 04:27:35 UTC
If you have a ton of examples to train on, that makes a recurrent neural
network even more attractive as (part of) a solution, doesn't it?
John Carlson
2015-11-05 05:30:09 UTC
I guess I’m not clear on what the inputs and outputs of the RNN would be. Would I send in various JSON documents with a YES/NO flag? Or would I send in each JSON document with an incrementally updated schema? Obviously I would get the schema as output. How would I determine an error for backpropagation? Sorry, my knowledge of RNNs is limited. I am trying to find a library, but so far I see netCDF, and I would have to convert my JSON documents to netCDF and download a bunch of packages. Also, my JSON documents are nested and hierarchical in nature.

John
Kevin Jones
2015-11-05 04:34:30 UTC
You might consider the following (from the dawn of the XML era):
http://ceur-ws.org/Vol-45/04-chidlovskii.pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.4.5668&rep=rep1&type=pdf

Kevin Jones



John Carlson
2015-11-05 05:34:48 UTC
Permalink
We already have XML Schema, and a JSON schema derived from it with the Jsonix Schema compiler. I am researching other methods because currently, Jsonix doesn’t support regular expressions (yes, we’re working on it). I’ve tried jskemator and GenJSON. I am sort of considering going up a level, to MOWL and MPEG-7, and then back down to JSON Schema. I’m just not sure I can see those specifications yet.

John
John Carlson
2015-11-05 05:51:53 UTC
Permalink
Looks like MPEG-7 and MOWL are multimedia (audio and video). I think MPEG includes VRML/X3D, but I’m not sure how MPEG-7 references it.

John
John Carlson
2015-11-05 18:10:22 UTC
Permalink
Hmm. I will have to convert these to Courier for reading. Do you know if they can handle nested <div>s, <Transform>s (X3D) and <g>s (SVG)?

John
John Carlson
2015-11-05 21:51:35 UTC
Permalink
Okay, I didn’t realize that people were that interested. We are translating VRML/X3D examples to JSON which contain nested transforms, and we don’t have ways of validating the JSON beyond jslint, json parse, and jsonlint, or converting it back to XML. I would like to add higher-level validation like what exists for XML: Schemas and Schematron (rules). JSON schema seems like the next step among many. Ultimately, we want to validate JSON which did not come from XML, but is translatable to XML, we hope.

Right now, the regular expressions (number patterns, enums) found in XML Schema are not being converted to JSON schema by the Jsonix schema compiler (and the regular expression languages may be different). The Jsonix schema compiler (JAXB and Jackson, I think) so far seems to be the best way to create JSON schema, but we haven’t tried to verify our JSON standard examples against it. Our JSON, generated with XSLT, contains special prefixes (@, #, -) for identifying attributes, comments and arrays. We have converted our XML to JSON with Jsonix and validated that. We’re also looking into converting XML Schema -> OWL -> JSON schema. Since we have prefixes not in the original schemas, an instances-to-schema approach seems appropriate, or we will have to hand-modify the JSON Schema produced by some XML-based tool to add the prefixes. So we are looking for a good incremental or batch tool that works on JSON and JSON schema.
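To make the instances-to-schema idea concrete, here is a minimal sketch, not any existing tool: it folds many parsed JSON documents into one summary and then emits a single JSON-Schema-like dictionary. The function names (`infer_schema`, `observe`, `to_schema`) are made up for illustration, and the sketch only records types, properties, and which properties appeared in every object; it does not infer patterns, enums, or recursive definitions.

```python
def json_type(value):
    """Return the JSON Schema type name for a parsed JSON value."""
    if isinstance(value, bool):      # bool is a subclass of int, so test it first
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        return "array"
    return "object"


def new_summary():
    """Empty running summary for one position in the document tree."""
    return {"types": set(), "count": 0, "props": {}, "items": None}


def observe(summary, value):
    """Fold one JSON value into the running summary."""
    kind = json_type(value)
    summary["types"].add(kind)
    if kind == "object":
        summary["count"] += 1        # number of objects seen at this position
        for key, child in value.items():
            entry = summary["props"].setdefault(
                key, {"seen": 0, "summary": new_summary()})
            entry["seen"] += 1
            observe(entry["summary"], child)
    elif kind == "array":
        if summary["items"] is None:
            summary["items"] = new_summary()
        for child in value:          # all array elements share one item summary
            observe(summary["items"], child)
    return summary


def to_schema(summary):
    """Turn a summary into a JSON-Schema-like dict."""
    types = sorted(summary["types"])
    if not types:                    # nothing observed, e.g. only empty arrays
        return {}
    schema = {"type": types[0] if len(types) == 1 else types}
    if "object" in summary["types"]:
        props = summary["props"]
        schema["properties"] = {k: to_schema(props[k]["summary"])
                                for k in sorted(props)}
        # A property is required only if every object at this position had it.
        required = sorted(k for k, e in props.items()
                          if e["seen"] == summary["count"])
        if required:
            schema["required"] = required
    if summary["items"] is not None:
        schema["items"] = to_schema(summary["items"])
    return schema


def infer_schema(documents):
    """Infer one schema from an iterable of parsed JSON documents."""
    summary = new_summary()
    for document in documents:
        observe(summary, document)
    return to_schema(summary)
```

Calling `infer_schema(json.load(open(p)) for p in paths)` over the example files would give one schema for the whole corpus rather than one per document. Because the prefixed keys are treated as ordinary property names, they end up in the result without any hand modification.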

For the recursive part, I believe that Group elements and Transform elements can be co-nested in any combination. Then there are elements below that and above that. There are likely other areas of the spec that are like that, but those are the main two that would handle a lot of cases.
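The co-nesting can be checked with a small recursive function. This is a rough sketch under assumptions: each node is taken to be a JSON object with a single key naming its type, and child nodes are taken to live in an array under a "-children" key. Both are illustrative guesses at the encoding, not taken from the X3D JSON files themselves. In JSON Schema, the same recursion would be written as a `$ref` back to a shared definition.

```python
GROUPING = {"Group", "Transform"}    # node types that may nest in each other


def check_node(node, path="$"):
    """Return a list of problems found under one {"TypeName": {...}} node."""
    if not isinstance(node, dict) or len(node) != 1:
        return [f"{path}: expected an object with exactly one node-type key"]
    name, body = next(iter(node.items()))
    here = f"{path}.{name}"
    if name not in GROUPING:
        return []                    # other node types are not inspected here
    if not isinstance(body, dict):
        return [f"{here}: expected an object body"]
    children = body.get("-children", [])
    if not isinstance(children, list):
        return [f"{here}: '-children' should be an array"]
    problems = []
    for index, child in enumerate(children):
        # Recurse: a Group may hold Transforms, which may hold Groups, etc.
        problems.extend(check_node(child, f"{here}.-children[{index}]"))
    return problems
```

An empty result means that every Group and Transform reachable through the "-children" arrays is well formed, at whatever depth it occurs.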

I am extending this problem out more generally, to things such as automated or incremental schema/ontology/model/database development, to see if there’s some correspondence I can leverage. I am talking to the Model Transformation By Demonstration folks about their process and whether it can handle recursion, or whether it can somehow be made recursive or stack driven.

XML Schema is here: http://www.web3d.org/specifications/x3d-3.4.xsd
XML Schematron is here: http://www.web3d.org/x3d/tools/schematron/X3dSchematronValidityChecks.sch
and here: http://www.web3d.org/x3d/tools/schematron/X3dSchematronValidityChecks2.sch

I’d like to see a JSON version of all three of these. JSON documents in a zip: https://github.com/coderextreme/x3djson/blob/master/X3DJSON.zip?raw=true

We cannot use any software you develop unless it is royalty-free. Pointers to papers are welcome, but I have trouble reading PDFs. Email/conversation is better for me. If you like, I can set up a Google Hangout or Skype session to discuss.

John
It's not yet clear to me from the discussion what exactly your problem is. In general, identifying recursive structure from data is a hard problem, and an underdetermined one. It's something I think I've got an approach to in the general case, but nothing you'd want to try if you are actually solving a simpler problem.
If you want to do this in a practical setting, you will probably need to bring as much a priori constraint as you know of into the technique you develop.
Can you give a bit more background on the problem you're trying to solve so we can try to see a correspondence to some formal class of problems and its solution techniques? You mention regular expressions are involved somehow. Can you say what the context of this is?
John Carlson
2015-11-06 20:30:45 UTC
Permalink
I think my problem now boils down to generating a JSON schema from several Java classes. I was able to generate Java classes (I think) by concatenating all my JSON files together and running them through jsonschema2pojo. There were some “duplicate” classes, but it looks manageable. 30 classes were generated and 13 were duplicates, so there are a total of 17 classes (this number seems small). I will have to sort through the classes before generating the schema. I am still looking for a tool that will generate the schema; it looks like I can add multiple classes to the schema-generating tools, though I haven't tried it or seen it done yet.
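One way to sort through the “duplicate” classes is to group them by structure. The sketch below assumes the generated classes have already been summarised by hand or by script as a mapping from class name to a field-name/field-type dictionary. That input format and the sample names are hypothetical and are not something jsonschema2pojo produces directly.

```python
from collections import defaultdict


def group_duplicates(classes):
    """Group class names whose field names and field types are identical."""
    groups = defaultdict(list)
    for name, fields in classes.items():
        # Field order does not matter, so use an unordered, hashable signature.
        groups[frozenset(fields.items())].append(name)
    return sorted(sorted(names) for names in groups.values())
```

Each inner list with more than one name is a candidate set of duplicates, so the number of inner lists is the number of structurally distinct classes.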