Google
 
Webnews.only-4-geeks.com
Interesting places
news.only-4-geeks.com Forum Index » Programming

Speculation about Dijkstra's primitives and how small each f

 
Jump to:  
 
Robert Maas, http://tinyu
PostPosted: Fri Aug 15, 2008 7:23 am    Post subject: Speculation about Dijkstra's primitives and how small each f
       
Dijkstra took function calls (including passing parameters to
called function and collecting return values) for granted, treating
each as an atomic action, and then said that a newly defined
function must combine these atomic actions in only three ways:
- Sequence (which can involve dataflow: return values from early
calls feed into parameters to later calls)
- Decision (IF/THEN or CASE/SELECT/COND)
- Repeating loop with test for exit (WHILE or FOR or MAP; i.e. LOOP in CL)
This restriction formed the basis of structured programming and
functional programming (except that in functional programming
a fourth type of combiner is allowed: LAMBDA/CURRY).

Extreme advocates for the language 'Forth' decided that each 'word'
(corresponds to 'function' or 'procedure' or 'subroutine' or
'method' in most other languages)
must combine atomic actions in just one (1) of those three ways. If
you have something that uses two or more of Dijkstra's combiners,
then the function/word ought to be split into smaller functions/words
each of which uses just one (1) of Dijkstra's combiners.

But tonight I got a new idea: In a "smart system", there are
several "facts" which are "known" by the software and used within
the code that comprises the system. Perhaps there should be a
Dijkstra/Forth style restriction that each function must know at
most one fact. If you have a function that knows two (or more)
unrelated facts, that function ought to be split into separate
functions each of which knows just one of those facts. Furthermore
each fact must be known by just one function.

Note that if you have a dataset that contains a large number of
specific facts all of the same type, accessed in a uniform manner,
that entire set should be treated like a table in a relational
database: Each access method for that table/dataset should be a
single function. So a single access method for a whole set of
related facts in a RDBS or similar dataset counts as just one
"fact" per my proposed restriction.

What do y'all think of my idea?
 

 
Calum Grant
PostPosted: Mon Aug 18, 2008 3:11 pm    Post subject: Re: Speculation about Dijkstra's primitives and how small ea
       
Robert Maas, LINK wrote:
Quote:
Dijkstra took function calls (including passing parameters to
called function and collecting return values) for granted, treating
each as an atomic action, and then said that a newly defined
function must combine these atomic actions in only three ways:
- Sequence (which can involve dataflow: return values from early
calls feed into parameters to later calls)
- Decision (IF/THEN or CASE/SELECT/COND)
- Repeating loop with test for exit (WHILE or FOR or MAP; i.e. LOOP in CL)
This restriction formed the basis of structured programming and
functional programming (except that in functional programming
a fourth type of combiner is allowed: LAMBDA/CURRY).

Extreme advocates for the language 'Forth' decided that each 'word'
(corresponds to 'function' or 'procedure' or 'subroutine' or
'method' in most other languages)
must combine atomic actions in just one (1) of those three ways. If
you have something that uses two or more of Dijkstra's combiners,
then the function/word ought to be split into smaller functions/words
each of which uses just one (1) of Dijkstra's combiners.

But tonight I got a new idea: In a "smart system", there are
several "facts" which are "known" by the software and used within
the code that comprises the system. Perhaps there should be a
Dijkstra/Forth style restriction that each function must know at
most one fact. If you have a function that knows two (or more)
unrelated facts, that function ought to be split into separate
functions each of which knows just one of those facts. Furthermore
each fact must be known by just one function.

Note that if you have a dataset that contains a large number of
specific facts all of the same type, accessed in a uniform manner,
that entire set should be treated like a table in a relational
database: Each access method for that table/dataset should be a
single function. So a single access method for a whole set of
related facts in a RDBS or similar dataset counts as just one
"fact" per my proposed restriction.

What do y'all think of my idea?

Well if you are interested in managing "facts" then Prolog is a great
place to start. One of my favourite languages.

As for restricting the responsibilities of functions - in OOP there's a
principle of single responsibility, so each class/function has one
responsibility. E.g. if you have a class "egg_and_beans" then that's
two responsibilities and the class should be split into two. However if
you have a class "breakfast" then that's coherent, and can again be one
class.

The comment on classes could equally apply to functions. They should
have one clear responsibility to avoid confusion and maximise reuse.

I've philosophised about this "paradox" that classes can do more yet
have fewer responsibilities. I think it's because the level of
abstraction has been raised.

What constitutes a "fact"? You might need a hierarchical concept of
facts, which matches your levels of abstraction. But that's just
structured programming. The flip of decomposition is composition. You
might also need to combine facts in different ways, and you're back to
logic programming which can express such things nicely.

I'd draw you away from the idea of "facts", since unless you are doing
logic programming, you are really interested in the "purpose" of a
component, and functions do more than manage facts.

As for each function knowing one "fact", there's a related principle
called the "Law of Demeter" (LoD). As a guiding principle it's very
useful, but is probably too much work to observe strictly.

LoD states that functions should not make assumptions about the
internals of the objects it uses. It forbids code like
toaster.handle.depress() (since you shouldn't probe the specific
features of the toaster), and prefers code like toaster.toast(bread),
thereby pushing the implementation into the object providing the
functionality. Actually that makes sense, since a future toaster could
use a laser-beam to toast, in which case is doesn't have a handle, but
could provide a toast() method. LoD is designed to make code more robust.

Also you talk about functions managing facts, but facts can be stored as
well as computed. There's a great programming paradigm which I
particularly like, which has decomposition, and binds functions to data
at various levels of abstraction. It's called object-oriented
programming! ;-)

Cheers,

Calum
--
Calum Grant - LINK
 

 
Robert Maas, http://tinyu
PostPosted: Fri Sep 26, 2008 3:11 pm    Post subject: Re: Speculation about Dijkstra's primitives and how small ea
       
REM> Dijkstra ... combine ... in only three ways:
REM> Sequence ... Decision ... Repeating loop
REM> Extreme advocates for the language 'Forth' decided that each 'word' ...
REM> must combine atomic actions in just one (1) of those three ways.
REM> But tonight I got a new idea: ... Perhaps there should be a
REM> Dijkstra/Forth style restriction that each function must know at
REM> most one fact.

Quote:
From: Calum Grant <n...@hotmail.com
Well if you are interested in managing "facts" then Prolog is a great
place to start. ...

I don't see how that directly relates to my point. My idea is that
people who write computer software would make sure to introduce
some fact (about how to implement an algorithm, such as the fact
that a particular kind of data record has three fields with various
names and data types) in only one place per such fact. Thus for
example there'd be an accessor who knows the implementation of
LastName within a Person record, and nobody else would know that
particular implementation, everyone else must go through the
accessor method. Or a sorted map might be implemented as a MBBT or
as a red/black tree or as an AVL tree, and nobody outside that
particular sorted-map class needs to know which of those three
algorithms is actually being used. Or a conversion package between
English and metric units knows the conversion rate between pounds
and kilograms, and even it knows that in just one place, mutiplying
or dividing as appropriate, or generating an inverse once to allow
multiplying at all times, rather than having that constant
handcoded in both direct and inverted form separately.

I don't see how Prolog will help much with telling the person
writing the software when he/she has made a mistake and coded the
same "fact" or related "facts" such as conversion and inverse
conversion in two separate places.

Quote:
As for restricting the responsibilities of functions - in OOP
there's a principle of single responsibility, so each
class/function has one responsibility. E.g. if you have a class
"egg_and_beans" then that's two responsibilities and the class
should be split into two. However if you have a class "breakfast"
then that's coherent, and can again be one class.

Yes, that's part of the basis for my generalization of that idea.

Quote:
The comment on classes could equally apply to functions. They should
have one clear responsibility to avoid confusion and maximise reuse.

Yes, that was part of my intent. It doesn't matter whether we're
dealing with what are called in various languages "functions" or
"methods" or "words" or "procedures" or "subroutines" etc., it's
all the same in regard to this principle of single owner of the key
fact that drives the algorithms.

Ideally it should be possible to abstract out what are usually
design patterns to become single-location abstraction-patterns,
such as might be implemented as "macros" (actually
source-parse-tree transformations prior to execution) in Lisp. C++
made a crude attempt in this direction with "templates", but IMO
that attempt is so narrow-minded as to be grossly insufficient for
this grand plan. For example, there's a design pattern of callable
producer and calling consumer for stream/sequence data (Java notation):
handle = new HandleOnStream(params);
while (handle.hasNext()) {
item = handle.getNext();
process(item);
}
or alterately with late signal of end-of-stream, needed for
filtered streams where you can't know in advance whether you will
be able to get more filter-valid data or not:
handle = new HandleOnStream(params);
while (true) {
item = handle.getNext();
if (item == NULL) break;
process(item);
}
It shouldn't be necessary to copy&paste either of those templates
over and over and over. All you should need is a "macro" for either
design pattern that takes parameters for each variable part and
builds the corresponding code automatically (Lisp notation):
(iterateOverStream #'HandleOnStream listOfParams #'process)
That assumes that new HandleOnStream returns an iterator object
which satisfies the interface of supporting getNext which returns
NULL whenever end-of-sequence has been reached.

Quote:
I've philosophised about this "paradox" that classes can do more
yet have fewer responsibilities.

I don't believe classes do more, at least not in a D/P sense.
Exactly the same "business logic" (data-processing code) gets
executed regardless of whether you have formal OOP methods in OOP
classes, traditional Lisp functions, traditional Algol procedures,
tradition FORTRAN subroutines, or even early-machine-language
hackery to fake subroutines that hadn't yet been invented.

Quote:
What constitutes a "fact"?

Something problem-domain-related that is needed to perform a step
of D/P (data processing), whether it's a data tranformation such as
net = gross - cost, or a logical decision such as anyone over 30
hours per week must not be hourly, or a resource locator such as a
filename or URL or name of table within database etc.

Now you can say that holding a filename in a function and then
calling that function merely begs the question. Instead of needing
to know the fileame, the calling code needs to know the name of the
function to call. But needing to know the filename itself is not as
robust as needing to know the function name, because functions can
be kept in packages in a hierarchial way such that there are a very
limited number of undetectable mistakes, whereas filenames are more
easy to accidently get wrong and thereby cause havoc in the file
system. Also functions have parameter types, which can be checked
at compiletime and/or runtime to *detect* most possible mistakes
and signal a error for orderly shutdown of the program before havoc
ensues. The error (exception) can include enough legible
information to quickly track down the coding mistake.

Quote:
You might need a hierarchical concept of facts, which matches
your levels of abstraction. But that's just structured
programming.

Well, if each fact is encapsulated into a separate function, either
the actual fact there if "constant", or a link to a configuration
entry if potentially different from year to year in a way that can
be anticipated, then you don't need an extra hierarchy to match the
structued program, the structured program gives the hierarchy to
you already. Now if configuration entries are structured, to make
easy to write software that uses new items not previously used or
to find already-used items and follow the links back to the
software unit that is responsible for all uses of it, that might be
a good idea if it's not a burden to implement.

Quote:
The flip of decomposition is composition. You might also need to
combine facts in different ways, and you're back to logic
programming which can express such things nicely.

Ideally generic algorithms (such as C++ "templates", and Lisp
"macros" implementing "special forms"), and ordinary functions can
be defined separately, once each, and then configuring their
combination for a special application is rather trivial. In some
cases you have some parameters that are fixed at combination time
(they are a constant for this application) and other parameters
that remain variable (so that the resultant fuction can be applied
several different ways within the overall application), so in
effect you are doing high-level currying. Is that what you have in
mind?

Quote:
I'd draw you away from the idea of "facts", since unless you are
doing logic programming, you are really interested in the "purpose"
of a component, and functions do more than manage facts.

It sounds like you misunderstood my use of "fact". As I used the
term, one major fact is the way that a merge-sort is done. You
break the overall file into short runs or singleton runs then merge
them together 2->1 until only one run remains. This breaks down
into several facts, how you make the initial runs, how you do a
single merge, and how you organize the cascade of runs until only
one remains, so perhaps three facts are inherent to this algorithm
itself, with external (known elsewhere) facts related to how linked
lists are manipulated and how the sorting function works and where
the key is found within the record. All of those external facts
could be supplied as functional parameters to the merge-sort
algorithm, where each functional parameter must satisfy some
particular interface. (In the SORT function in Common Lisp, the
list-processing primitives are taken a priori, CAR/CDR/CONS, but
the comparison function and key function are given as functional
parameters. This works only because list-processing is totally
generic type, a single type of CONS cell that handles all possible
elements in the CAR position, so you can do *all* variants of
sort-merge with a single CAR/CDR/CONS set of functions. In a
function that doesn't have generic elements in CAR position, we'd
need a separate set of list-processing primitives for each type of
object a list can hold, so the list-processing set-of-functions
would need to be a parameter also.)

Quote:
As for each function knowing one "fact", there's a related
principle called the "Law of Demeter" (LoD). As a guiding
principle it's very useful, but is probably too much work to
observe strictly.

I looked it up:
<http://en.wikipedia.org/wiki/Law_of_Demeter>
This seems closely tied in with the concept of interfaces. I don't
necessarily mean the formal type of interfaces in Java and Seed7,
rather just the idea that the API for a software library won't
change drastically, that code you write assuming a particular
calling interface will remain valid indefinitely. So when you are
writing code that directly deals with handles/pointers on data
objects managed by some other module, you stick to calling
functions/methods per the published API and don't make any
assumptions about what happens "behind the curtain" or "under the
hood". (Unlike the Wizard of Oz or your personal automobile, the
software library somebody else wrote is generally actually working
as advertised, modulo just a few bugs that have been discovered
from time to time, and doesn't break down over time.) And of course
that the API is "complete" in the sense of providing all the
functionality that you would need from that type of data object.
(Note: By "type" I can mean either formal OOP class or intentional
type interpreted on top of a primitive or OOP class.)

Quote:
The fundamental notion is that a given object should assume as
little as possible about the structure or properties of anything
else (including its subcomponents).

I hope this allows for an API that returns components of specific
types, whereupon the caller can then call whatever library mangages
*that* type of data object for further processing of it? For
example, a PERSON object might have a way to finding out the NAME,
and such a NAME object might have a way of finding out whether the
name is Western or Oriental or Nickame or some other sub-type, and
in each case appropriate methods would be available to learn the
family name or given name etc. as much as possible for each such
sub-type. Accessors for components wouldn't *always* be desirable,
just in some cases, OK? And some accessors might require a password
or crypto-key etc., or a specific authorization certificate
public-key signed by an authority above the module managing the
confidentia data. And the return value might not be the data value
itself but a public-key encrypted version of the data which only
the authorized caller can decipher. OK? (Actually the ability to
return an OOP object rather than just a primitive type such as
string supports such data-access protection.)

Quote:
LoD states that functions should not make assumptions about the
internals of the objects it uses. It forbids code like
toaster.handle.depress() (since you shouldn't probe the specific
features of the toaster), and prefers code like
toaster.toast(bread), thereby pushing the implementation into the
object providing the functionality.

Maybe in this specific example the call is too knowledgeable of
internals, that all toasters have a "handle" and that such handle
when depressed performs a useful function that is the same across
all toasters. But I'm not convinced this syntax should be forbidden
in all cases. For example, should it be forbidden to contact the
CEO of a company by doing company.ceo.sendMessage("...") ? If
companies have several officers, each of which is a person, with
methods for getting each such person object, and if for each person
there's a preferred (default) e-contact method, but each person
also has several other methods such as getName or
checkIfAnyReceivedMessages or getImages etc., what's wrong with
first getting this or that person-object related to the company
then calling one of the person methods on that person-object? I
don't see why you would be required to tell the company object
directly the message you want to send to the CEO, or tell the
company every time you want to see a photo of the CEO, and I don't
see why the company object should have a gazillion methods such as
sendCeoMessage sendCfoMessage getCeoImages getCfoImages etc.
I think it should be a matter of judgement which is better,
mainObject.relatedObjectAccessor.doSomethingWithRelatedObject or
mainObject.invokeUltimateTaskSomehow for any particular case.

Quote:
Actually that makes sense, since a future toaster could use a
laser-beam to toast, in which case is [sic] doesn't have a
handle, but could provide a toast() method.

I have a "toaster oven" that requires rotating a timer knob instead
of depressing anything. I imagine anything with a laser would
likewise be multi-function, not just a "toaster".

Quote:
LoD is designed to make code more robust.

Whether it accomplishes that goal is an open question, depending on
particular circumstances.

Quote:
Also you talk about functions managing facts, but facts can be
stored as well as computed.

I hope you're talking about the same topic I'm talking about,
namely facts in the sense of how to perform tasks, i.e. algorithms
and the primitive data that is essential to them. For example, the
fact of how to do a merge is the following algorithm: You have two
input streams, one output stream, and a comparison function that
tells you whether record#A is less than record#B in some sense.
While both input streams still have data, compare the next record
of each stream, and if record#A is less than record#B then move
record#A from input stream#A to output stream, else move record#B
from input stream#B to output stream. (The entire main merge
algorithm then follows by copying all the rest of the data from the
one remaining input stream that still has input to output the
output stream, thus has a second fact about what to do when one of
the streams is exhausted. Typically the *entire* merge algorithm
also has methods for opening each of the input streams and the
output stream and closing them at the end.) In general these
algorithmic facts are both stored and computed, i.e. you store the
algorithm in the main memory and the execute that code at the
appropriate time. I'm not sure that's what you are talking about.

Quote:
There's a great programming paradigm which I particularly like,
which has decomposition, and binds functions to data at various
levels of abstraction. It's called object-oriented programming!

AFAIK that's not correct. In OOP, data objects are bound to their
specific class objects, which in turn are bound to their parents,
and somewhere along the chain from the specific type to the most
generic type there's a link to the most specific method.

But OOP has a fundamental problem: It ties together all the types
of actions for a given class of object, forcing all that disparate
code to be located in one place (modulo the possibility of some
methods being linked to the most specific class and others to more
generic classes), while splitting up all the uses of a given action
according to all the various classes that implement it. The
opposite organization, of collecting together all the code for
doing the same task on lots of different kinds of objects, isn't
usually allowed. The other alternative, of having each
action+objectType combo defined in a separate place, also isn't
usually allowed.

OOP is one way of organizing software, and is not always the best way.
I accept that it's sometimes useful, but I don't glorify it as you do.

So let me know if you like my idea for explicitly separating the
concept of language-defined datatypes (a.k.a. "internal" or
"implementational" datatypes) vs. intentional datatypes, and
treating intentional datatypes as the most important when
organizing data-processing algorithms/softwareLibraries, and if
you'll offer to work with me to organize the world's archive of
useful software per which intentional datatypes each software
module operates upon, basically a re-organization of what I started
with my "CookBook/Matrix".
 

Page 1 of 1 .:.

Google
 
Webnews.only-4-geeks.com

Windows Update | C++ | C | PHP | JavaScript | Photoshop | Programming | Windows 2000 | Python | Windows XP | Object | Flash | Flash - ActionScript | Paint Shop Pro | Excel | PowerPoint | Access | Word | Windows 98 | Internet Explorer 6.0 | CorelDraw12 | Java | XML | asm x86 | Linux Mandrake | Linux RedHat | Outlook |  | news from newsgroups |_ | s

Web Templates

Awesome Website Templates ©

projekty domów Makumba - Big Cyc Diety łańcuchy śniegowe Tomaszow Mazowiecki