Encapsulation theory

#11 08-24-2008, 08:15 AM
Ed Kirwan

Re: Encapsulation theory

S Perryman wrote:

> "Ed Kirwan" <IAmFractal@hotmail.com> wrote in message
> news:g89um1$he0$1@aioe.org...
>
>>S Perryman wrote:
>
>>> The canonical definition of information hiding is that implementation
>>> detail relating to a component is not visible to users of the
>>> component. Good examples are Ada, CLU, Modula-2.
>
>>> The other extreme is where implementation detail is visible, and
>>> its definition (or changes therein) impact the user. C++ being a
>>> good example.
>
>>> Within that spectrum, there are variations on the theme (Java
>>> and protected/private at the package level etc).
>
>> You raise another excellent point, of course: definitions.
>
> It is not sensible to redefine long-standing terms within a given subject
> domain. It only confuses people. And within the CS domain, the OO
> community have been very loose on this matter (a classic being equating
> encapsulation with information hiding).
>

Agreed, re-wording canonical definitions generates confusion, though perhaps
such confusion is pardonable if the re-wording allows itself to be
measured against the original such that, if the re-wording offers no
advantage over the original, it can be safely ignored.

(Such a process may even strengthen our appreciation of the original
definition, just as falsifiable scientific theories are strengthened by
continued, unsuccessful efforts to find unpredicted results.)

In other words, we should state what we hope the re-wording will give us
that the old doesn't, and then try to demonstrate this increased utility.

The paper posits that a re-wording of both, "Encapsulation," and,
"Information hiding," allows us to prove - in certain circumstances - the
validity of some programming best practices currently held intuitively,
namely:

i) Minimise scope. This best practice advises that, for example, if a
program unit within a subsystem is only used by other program units within
the subsystem, then it should usually not be accessible outside that
subsystem.

ii) Optimise size. This best practice says that if our system consists of
two subsystems and one subsystem has ten times as many program units as the
other, then the larger subsystem should usually be decomposed into several,
smaller subsystems.

Neither is provable - to my limited knowledge - for any cases given the
current, canonical definitions.

Given the re-wording of information hiding from the earlier post, and the
re-wording of encapsulation as, "The placing of program units within a
subsystem," then for a simplified, idealised programming language, both best
practices can be proved in terms of the maximum possible number of source
code dependencies of a system. (These wordings are the lay translations of
the actual mathematical definitions used in the proofs.)

As you noted already, the maximum possible number of source code
dependencies, s, for an unencapsulated system of n program units is
given by:

s = n(n-1)

Using the definitions above, we can show that the lowest figure for this
maximum possible number of source code dependencies where p is the number of
program units accessible outside an individual subsystem is given by:

s = n(2 * sqrt(np) -1 -p)

The lowest figure possible is when p=1; when p>1, this equation gives a good
approximation of the lowest figure.

It can also be shown that this approximation to the lowest figure for the
maximum possible number of source code dependencies is achieved when the
number of subsystems, r, is given by:

r = sqrt(n/p).
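
As a rough numerical check of the equations above, here is a minimal Python sketch. It assumes (this is not stated in the post) that the n program units are split evenly across r subsystems, each exposing p units, so that any unit can depend on the other n/r - 1 units in its own subsystem plus the p(r - 1) units exposed by the other subsystems. All function names are hypothetical.

```python
import math


def max_deps_unencapsulated(n: int) -> int:
    """Maximum possible source code dependencies with no subsystems: s = n(n-1)."""
    return n * (n - 1)


def max_deps(n: int, r: float, p: int) -> float:
    """Maximum possible dependencies for n units split evenly over r subsystems,
    each exposing p units (an assumed, idealised model, not taken from the paper)."""
    own = n / r - 1        # other units inside the same subsystem
    foreign = p * (r - 1)  # exposed units in the other subsystems
    return n * (own + foreign)


def best_subsystem_count(n: int, p: int) -> float:
    """Number of subsystems from the post: r = sqrt(n/p)."""
    return math.sqrt(n / p)


def lowest_max_deps(n: int, p: int) -> float:
    """Closed form from the post: s = n(2 * sqrt(np) - 1 - p)."""
    return n * (2 * math.sqrt(n * p) - 1 - p)


if __name__ == "__main__":
    n, p = 100, 1
    r = best_subsystem_count(n, p)
    print(max_deps_unencapsulated(n), r, max_deps(n, r, p), lowest_max_deps(n, p))
```

Under these assumptions, n = 100 and p = 1 give r = 10 and s = 1800, against 9900 for the unencapsulated system; with r = 1 the model reduces to n(n-1).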

If the utility of these equations is less than the confusion caused by the
re-wordings, then the re-wordings are indeed valueless (or worse).

All that notwithstanding, however, your point remains valid: I perhaps
should not have attempted a re-wording and merely introduced two new
concepts (something like, "Analytic encapsulation," and, "Analytic
information hiding").

>
>> The paper provides a definition of information hiding as, "The
>> restricting of forming dependencies on any particular program unit within
>> a subsystem from outside that subsystem."

>
> But you have to precisely define what dependency actually is.
> For any given component C, there can be dependency on C's:
>
> - interface
> - implementation
> - transitive closure (all the components that C depends on - directly or
> indirectly)
> - release granularity (the package etc to which C belongs)
> - etc
>


That's yet another good point.

The dependencies discussed are non-transitive source code dependencies. I'll
update the paper to clarify this (as well as the other issues you've raised,
of course).

Flicking through my copy of Lakos's, "Large Scale C++ Software Design," I
see that software entities are, broadly, logical (e.g., a class) or
physical (e.g. a file); dependencies are categorised based on the type of
entity involved.

I intended the dependencies under discussion, therefore, to be logical
dependencies: if an instance of class A references class B, then both
classes can be described as a node and the reference can be described as a
dependency. This is irrespective of whether the reference is on an
implementation or interface, whether the reference is due to association or
inheritance.

If, as in Java, class A depends on interface Test and interface Test is
implemented by concrete class ConcreteTest, then A has a source code
dependency on Test but not on ConcreteTest.
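
The Java case above can be pictured as a small directed graph. The following Python sketch is only an illustration: it records direct (non-transitive) source code dependencies as edges, using the class names from the example. The helper names `depends_on` and `dependency_count` are hypothetical.

```python
# Direct (non-transitive) source code dependencies: node -> set of nodes it references.
dependencies = {
    "A": {"Test"},             # A references only the interface
    "ConcreteTest": {"Test"},  # the concrete class implements the interface
    "Test": set(),             # the interface references nothing here
}


def depends_on(graph: dict, source: str, target: str) -> bool:
    """True if source has a direct source code dependency on target."""
    return target in graph.get(source, set())


def dependency_count(graph: dict) -> int:
    """Total number of direct dependencies (edges) in the graph."""
    return sum(len(targets) for targets in graph.values())


if __name__ == "__main__":
    print(depends_on(dependencies, "A", "Test"))
    print(depends_on(dependencies, "A", "ConcreteTest"))
    print(dependency_count(dependencies))
```

Because only direct edges are stored, A's dependency on Test does not extend to ConcreteTest, which is the non-transitive reading described above.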

(Of course, the physical files of a system can also be described as a
directed graph, albeit one different from the logical view of that system,
and even that physical system - with its physical dependencies - obeys the
equations above.)

>
>>> For any discussion of any given prog lang P, you should be
>>> able to enumerate all the dependency relations that can exist
>>> between components in P.

>
>>> What do you think the set of relationships are for Java ??

>
>> Well, the paper doesn't give the set of relationships for Java but the
>> maximum number of source code dependencies and shows how this is minimised
>> for a certain number of packages each with a certain number of public
>> classes.

>
> The minimal structure possible for any graph is a tree.
> So for a dependency graph, the number of dependencies for a tree = N - 1.
> The dependency bounds (D) for any directed graph is by definition :
>
> (D(N) = 0) = (N = 1)
> (N > 1) AND ( D(N) / (N - 1) IN [1,N] )
>
> Whatever definition of dependency you use, a system of components will
> produce a value in those bounds. For example, the optimal encapsulation for
> release granularity would be one that recursively defines encapsulations
> as trees.
>
>
> Regards,
> Steven Perryman


Regards,

Ed

--
Encapsulation theory fundamentals:
www.EdmundKirwan.com/pub/paper1.pdf
#12 08-26-2008, 07:43 AM
S Perryman

Re: Encapsulation theory


"Ed Kirwan" <IAmFractal@hotmail.com> wrote in message
news:g8rj9s$32u$1@aioe.org...
>S Perryman wrote:
>
>> "Ed Kirwan" <IAmFractal@hotmail.com> wrote in message
>> news:g89um1$he0$1@aioe.org...
>>
>>>S Perryman wrote:
>>
>>>> The canonical definition of information hiding is that implementation
>>>> detail relating to a component is not visible to users of the
>>>> component. Good examples are Ada, CLU, Modula-2.
>>
>>>> The other extreme is where implementation detail is visible, and
>>>> its definition (or changes therein) impact the user. C++ being a
>>>> good example.
>>
>>>> Within that spectrum, there are variations on the theme (Java
>>>> and protected/private at the package level etc).
>>
>>> You raise another excellent point, of course: definitions.
>>
>> It is not sensible to redefine long-standing terms within a given subject
>> domain. It only confuses people. And within the CS domain, the OO
>> community have been very loose on this matter (a classic being equating
>> encapsulation with information hiding).
>>

> Agreed, re-wording canonical definitions generates confusion, though
> perhaps such confusion is pardonable if the re-wording allows itself to be
> measured against the original such that, if the re-wording offers no
> advantage over the original, it can be safely ignored.
>
> (Such a process may even strengthen our appreciation of the original
> definition, just as falsifiable scientific theories are strengthened by
> continued, unsuccessful efforts to find unpredicted results.)
>
> In other words, we should state what we hope the re-wording will give us
> that the old doesn't, and then try to demonstrate this increased utility.
>
> The paper posits that a re-wording of both, "Encapsulation," and,
> "Information hiding," allows us to prove - in certain circumstances - the
> validity of some programming best practices currently held intuitively,
> namely:
>
> i) Minimise scope. This best practice advises that, for example, if a
> program unit within a subsystem is only used by other program units within
> the subsystem, then it should usually not be accessible outside that
> subsystem.


Not so, because the above conflicts with *utility*.
The fact that a component is *currently* used only within one unit of
encapsulation does not mean it should be confined to that unit.

And this is where your problem with information hiding comes in.
The concept in effect:

1. removes dependency on implementation decisions
2. (implicitly) protects against component users violating the state of said
implementations (you cannot access what you cannot see).

So 2 becomes a much stronger criterion for determining access to such
components. And 1 is prog-lang dependent (again going back
to my statement that you must define the particular prog lang and dependency
relations).


> ii) Optimise size. This best practice says that if our system consists of
> two subsystems and one subsystem has ten times as many program units as
> the other, then the larger subsystem should usually be decomposed into
> several, smaller subsystems.


To manage complexity (understanding specifically - which is a human issue).


> Neither is provable - to my limited knowledge - for any cases given the
> current, canonical definitions.


Is it not provable that:

- uncontrolled access to components +/- their properties can result in s/w
that has compromised its own correctness?

- depending on the prog lang, dependency (direct or transitive) on
components that are referenced neither by a user UC nor by the component
itself has quantifiable measures for the extent to which system change will
affect UC?


> Given the re-wording of information hiding from the earlier post, and the
> re-wording of encapsulation as, "The placing of program units within a
> subsystem," then for a simplified, idealised programming language, both
> best practices can be proved in terms of the maximum possible number of
> source code dependencies of a system. (These wordings are the lay
> translations of the actual mathematical definitions used in the proofs.)

> As you noted already, the maximum possible number of source code
> dependencies, s, for an unencapsulated system of n program units is
> given by:

> s = n(n-1)

> Using the definitions above, we can show that the lowest figure for this
> maximum possible number of source code dependencies where p is the number
> of program units accessible outside an individual subsystem is given by:

> s = n(2 * sqrt(np) -1 -p)


The lowest number of possible component dependencies = N - 1.
As I stated, this is when the dependency graph is a tree.
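
For illustration, a minimal Python sketch of these bounds, assuming a connected dependency graph of N components. It only counts edges: a tree gives N - 1, and every component depending on every other gives N(N - 1). The function names are hypothetical, and the chain used here is just one example of a tree.

```python
def dependency_bounds(n: int) -> tuple:
    """Return (lowest, highest) dependency counts for a connected
    directed graph of n components, with at most one edge per ordered pair."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return (0, 0)
    return (n - 1, n * (n - 1))


def chain_tree(n: int) -> list:
    """Edges of a simple chain 0 -> 1 -> ... -> n-1, one example of a tree."""
    return [(i, i + 1) for i in range(n - 1)]


if __name__ == "__main__":
    for n in (1, 2, 5, 100):
        print(n, dependency_bounds(n), len(chain_tree(n)))
```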


Regards,
Steven Perryman

