First posted on Wednesday, March 31, 2010
Nontrivia
Recentest significant change: June 19, 2010.
In my previous post "Unsettlings" I discussed a double opposition, or "double chiasm" as I called it, among the (cognitive) lights in which a given phenomenon would seem (1) simpler or (2) more usual or normal or (3) clearer, more clarificatory, more significant or informative, or (4) deeper, less trivial:
The first three correlate pretty obviously to mathematics of optimization, mathematics of probability, and mathematics of information. The fourth (nontriviality, depth, etc.) seems to me to correlate to mathematical logic.
Inverseness between probability and information. A message's quantity of information, its amount of informativeness or "newsiness," reflects the improbability of that message before it was sent. The information quantity is not simply 1 minus the message's erstwhile probability (e.g., 100% minus 30% probability equals 70% improbability), but still it's pretty simple: the logarithm of the reciprocal of the erstwhile probability, and we can think of information as a kind of inverse of probability. It goes up when the erstwhile probability goes down, and vice versa. Well, actually it's a little more complex than that. The information is measured as a logarithm to a given base. If the base is not specified, then the logarithm is telling you something like message length, e.g., how many (instances of) symbols. Four quaternary symbols can distinguish 16 times as many messages as four bits (256 versus 16) and so carry twice the information, at the same message length. In amount of information, four bits (binary units) equal two quaternary units. In Peircean terms, the message length corresponds to the number of individual instances (or individual "replicas") of symbols; the base corresponds to the number of general "replicas" of symbols on which the message depends (binary 0 and 1, trinary 0, 1, and 2, etc.). That said, onward.
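The arithmetic in the paragraph above can be checked with a short Python sketch. The function name `self_information` and the 1-in-16 example are only illustrative assumptions, not anything from a particular library:

```python
import math


def self_information(prior_probability: float, base: float = 2.0) -> float:
    """Information carried by a message, in units of the given base.

    Computed as log_base(1 / p), where p is the probability the message
    had before it was sent.
    """
    if not 0.0 < prior_probability <= 1.0:
        raise ValueError("prior_probability must be in (0, 1]")
    # Change of base: log_b(x) = log2(x) / log2(b).
    return math.log2(1.0 / prior_probability) / math.log2(base)


# A message that had a 1-in-16 chance before it was sent:
in_bits = self_information(1 / 16, base=2)   # four binary units (bits)
in_quats = self_information(1 / 16, base=4)  # two quaternary units

# Same message length (four symbols), different alphabets:
messages_binary = 2 ** 4      # 16 distinguishable messages
messages_quaternary = 4 ** 4  # 256 distinguishable messages
bits_in_four_quats = math.log2(messages_quaternary)  # eight bits
```

So the same improbability comes out as four units in base 2 and two units in base 4, while four quaternary symbols hold twice the information of four bits, not sixteen times as much.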
So, inverseness between optimality and nontriviality? If one puts, as I do, optimality and nontriviality/depth likewise into an opposition, one might expect a similar kind of inverseness. Lloyd's and Pagels's idea of thermodynamic depth is "the entropy of the ensemble of possible trajectories leading to the current state" (from Cosma Shalizi's notebook on complexity measures) and gets us the idea of some sort of opposite or inverse of the shortest path (simple, optimal, etc.). Then there is the idea of algorithmic complexity, the length of the shortest program capable of obtaining a given result, a complexity that is uncomputable because of the halting problem; and anyway there is the general idea that you can't get a ten-pound theorem out of five pounds of axioms (as discussed by Chaitin). So by merely looking for "big-picture" patterns (and roving through things like the Mathematics Subject Classification), I seem, despite my amateurish ignorance, to have found myself in the right neighborhood.
Shalizi, with a bluntness that is helpful to the general reader, starts out his above-linked notebook "Complexity Measures" with this striking paragraph:
Anyway, so maybe it's the same for the nontrivial as for the optimal. One doesn't typically seek an amount of nontriviality; instead one typically seeks nontrivia, complexuses, etc. Now, it's not so hard to understand what constitutes an optimal case, a probable case, and an informative or "newsy" case. But, if nontrivia are to be considered as being on some sort of par with optima, probabilities, and information, then what constitutes a nontrivium, a nontrivial case?
Now, there are some other big-picture considerations here. I'm thinking philosophically, analogically, so please bear with me. There are temporal issues involved with the conceptions of optima, probabilities, and information.
1. Optima and feasibles are, for lack of a better word, potentialities (with the optima as "debentialities," lowest or most efficient potential expenditures, what would really be owed) for what could happen or be done given things as they stand; the impact of directly revealing or acting if one were to reveal or act now (the moment of decision); correlated more or less to the surface of the future light cone.
2. Probabilities pertain to what is going to happen in the course of a future in virtue of repetitions; that which does happen thereby reaches 100% probability.
3. Information is newsiness and pertains to what is coming to light or being actualized (correlated more or less to the surface of the past light cone) but not already settled; if the message's information is already known, then the information is zero.
So we have this pattern (of characterizations, not definitions):

1. Optima, most feasible, simplest, most efficient, etc. — things worth supposing, imagining, etc.
2. Probabilities — things worth expecting.
3. Information — things worth noticing.
Ergo (by completing the analogy):
4. Nontrivia — things worth remembering. This associates the nontrivial or deep with truth or fact in some sense, as well as with the complex, the complicated, etc. Some mazy and labyrinthine complications have a kind of triviality when they don't teach real lessons, still they can be worth remembering — ask any lab rat. The idea of that which offers lessons worth learning, remembering, etc., that which is "educational" in some sense, that from which lessons or more or less secure conclusions can be drawn, is another thing which distinguishes the nontrivial from the distinctive, informative, etc. We learn from the past; experience is the great teacher. But can it be that the nontrivial case is simply that datum, fact, or basis (e.g., some postulates) from which one can draw conclusions? What is the complexity in it - simply that it is non-tautologously true? This seems to be missing something in that which mathematicians mean by "nontrivial" and "deep."
There's another big-picture issue: what you might call that of subjective nontriviality versus objective nontriviality, but which would better be called aspectual nontriviality versus transpectual nontriviality. This could be a newsy distinction, since I haven't found any notice of it as a possible source of confusion. Take a nontrivial equivalence between mathematical propositions: its nontriviality is a nontriviality in outward aspect, for the same reason as that behind mathematicians' joke that anything proven is trivial. I don't want to call it "subjective" since that would imply incorporating a subjective judgment into the reasoning itself about a mathematical structure, just as "subjective probability" suggests trying to quantify one's subjective expectation in a specific case. As for "transpectual," I just mean that as the opposite to "aspectual": if two statements are different in form but logically (or as it is sometimes said, "formally") equivalent, then they are different aspectually but the same transpectually (i.e., when you look through them enough). The difference between a good deductive proof and a circular deductive proof that assumes what it purports to prove is transpectual, not aspectual (even though in a good deductive proof the conclusion is in the premisses in a sense), because, unlike the good deductive proof, the circular deductive proof includes unestablished information (its conclusion in some form) in its premisses, while the good deductive proof includes only established information (including its conclusion in some form) in its premisses. Update: Now I think that the difference is neither aspectual nor transpectual, and that the aspectual-transpectual dichotomy is not so general as I had supposed.
One's actual knowledge or ignorance of what is already implied does not depend simply on what is already implied, and one's actual knowledge or ignorance does not lend the conclusion an "aspect" in the same sense as the notably persistent novel aspect of "Therefore Socrates is mortal" as deduced from its usual premisses. Equipollence (equivalence between propositions) is a transpectual simplicity; mutual non-implication is a transpectual complexity. Independence (or as some express it, independence and consistency) among axioms or postulates is a transpectual complexity. Nontriviality as a criterion of value of equipollential inferences is ironic, and is ironic and aspectual in the same way as analogous criteria for other modes of inference. (The ironic aspectual criteria may be used merely intuitively in devising methods of reasoning; whether one employs a method of incorporating specific subjective judgments of amount of likelihood or whatever into reasoning is another question, one which I'm not really addressing.) Examination of the pattern may lend some subjective probability to my claim.
(Note: mathematical conclusions are often through equipollential deduction. For a common example, the induction step in mathematical induction is equipollential: the conjunction of the ancestral case and the heredity is equipollent to the conclusion. The conclusion is a universal hypothetical (in form) while the ancestral case is an existential particular, but the equipollence is intact because the existence of the well-ordered set to whose elements the hypothetical conclusion refers is already assumed and usually actually already proven. In a simpler case than mathematical induction, in a nonempty universe "whatever there is, is blue" (hypothetical in form) validly implies the existential "there is something blue." Proofs of the ancestral case and the heredity are often through equipollential deductions, though sometimes not so, especially when greater-than or less-than statements get involved.)
1. A surmise to a best or "optimal" explanation seeks a kind of aspectual optimality, one for which we do not expect a deductive standard optimization algorithm. Such a surmise, seeking simplicity, is actually complex, and usually both adds and removes information (or data, or howsoever you wish to think of it). That's ironic. It's as if such surmise were seeking to compensate for its own complexity by seeking simplicity. Some speak in this connection of parsimony in hypothesis-formation, but the desirable simplicity should not be confused with "logical simplicity," as Peirce notes (see sidebar); rather it's an idea of that which is most natural, "facile" as Peirce says, or feasible. This is not only for hypotheses in the usual sense. Insofar as any theory's bad match to experimental results (in physical, material, and biological sciences and in human and social studies) can be explained away by additional hypotheses, there's always a role for the simplest explanation: the simplest "hypothesis" to account for a theory's persistent bad results is that the theory is wrong.
2. An induction to a trend seeks a kind of aspectual probability, one for which one does not expect a deductive standard probability measure (well, maybe in Bayesian probability, which I don't understand well enough to discuss; but those "priors" are not arrived at deductively). A lot of statistical inference seems to involve avoiding the pitfalls of reliance on subjective probability, while still giving us something aspectually probable that satisfies our desire for something like it; a frequentist might say that good statistical inference attempts not to formulate our expectations but rather to tell us what, if anything, is worth expecting, to the extent that that's just a way of talking about the light (if any) cast upon the future by objective ratios among cases. Actually an induction (adding but not removing information) increases not probability but information. That's ironic. It's as if induction were seeking to compensate for its informativeness by seeking likeliness.
3. A syllogistic or other "forward-only" deduction seeks to bring information to light, but it doesn't really increase information; it reduces it. That's ironic. There's little that I can find about efforts to quantify the "psychological novelty" (as various folks have called it) or "new aspect" (as Peirce called it) or seeming informativeness of a forward-only deduction's conclusion. It's aspectual. (Maybe there's a Bayesian way!) Another way to look at it is that a forward-only deduction increases probability (if the premisses are assigned probabilities between 0% and 100%); in order to be true, the conclusion doesn't absolutely need all (or sometimes any) of the premisses to also be true. Anyway, it's as if forward-only deduction were seeking to compensate for its decrease of information (or increase of probability) by seeking newsiness.
4. So, continuing the pattern, the nontriviality of a deduction through equivalences or equipollences will be aspectual and ironic. Equipollential deduction neither adds nor removes data, and it's as if it were seeking to compensate for that simplicity with a kind of complexity in the sense here called aspectual. The transpectual complexity or complexus will involve independences, mutual non-redundancies, etc. It is surmise (by which I mean inference that both adds and removes information) that is transpectually nontrivial, even though surmise ironically seeks a kind of aspectual simplicity, naturalness, etc. Now, an aspectual informativeness (psychological novelty, new aspect, whatever one wishes to call it) is sought through a syllogistic or other forward-only deduction — an extrication of information by removing some of the clutter, so to speak, of the premisses. That (aspectual) informativeness is not to be confused with its kin, the (aspectual) nontriviality that is sought through equipollential deduction; that nontriviality consists (as far as I can tell) in the outward disparities of things bridged by a proven equipollence, a bridge which one may wish to cross and recross in either direction.
But can't I do better than that? Here I'm describing as aspectual the typical sense of "nontrivial" in mathematical talk, but what makes something "transpectually" nontrivial? Is that kind of nontrivial simply a set of independent facts or truths or givens, i.e., they couldn't have proven or disproven one another? Do they have to be "facts or givens worth remembering" or is it enough that their interrelations are facts or givens worth remembering? Are such complexuses really the core of logical ideas such that logic should have been named for them ("nontrivium theory" or "complexus theory" or whatever), just as probability theory is named for probability, and so on? They may be optimal or otherwise (or more precisely, perhaps, they may be such that they would have been optimal or otherwise); but they are the paths which have been traveled, the structures which have been built. Is that it? Is a "transpectually nontrivial" statement simply one that is consistent and materially true, just not tautologously true? But isn't logic about formal truth, not material truth?
Actually that's not what bothers me. Basic deductive logic is about deducing material truths from other material truths - more or less, facts from facts, be the basal facts postulated or established observationally or merely supposed as premisses for the sake of argument. In that sense deductive logic is about material or nontautologous truths in the same sense that probability theory is about probabilities (and optimization theory is about optima, and information theory is about information). I like that idea of transpectual nontriviality: it avoids suggesting that lengthy convolution of an argument is the essence of nontriviality or depth and somehow logically "better," more "logicful," when in real life such an argument is riskier, less likely to escape a weakest-link problem. Such convolution increases aspectual nontriviality (sometimes only in a superficial way, to boot), not transpectual nontriviality, much less security or factuality.
The conclusion of a probability is not the probability of that conclusion, and the same goes for the conclusion of a nontriviality and the nontriviality (or lack thereof) of that conclusion as a conclusion. A syllogistic deduction does not turn its concluding proposition into a formal truth (or a logical truth, if you prefer). The fact that Socrates is mortal is a material truth even if deduced from other material truths, even if deduced from a postulate or axiom that Socrates is mortal. If it is postulated that Socrates is mortal in advance of premisses, a premissual proposition "Socrates is mortal" is part of the tautology "Socrates is mortal by the postulate that Socrates is mortal", but the fact, the datum, that Socrates is mortal is not tautologously true. The nontrivium is that basis on which conclusions - further bases - can be drawn. This is a kind of basality which is not the same thing as basicness or fundamentality. A set of such postulates, or, say Euclid's five postulates, independent (and consistent), have more transpectual nontriviality or depth than any single such postulate. So, if nontriviality can't be usefully quantified, maybe it can at least be ordered. Add a postulate, enrich or deepen the system - transpectually if not aspectually. (Should one say that Gödel statements are transpectually nontrivial but aspectually trivial in the mathematical system in which they are true but unprovable?) Even an axiom of propositional logic is not completely trivial, when it is introduced as an axiom, though from it by itself there follows little if anything. Those considerations may seem a bit slippery but they're not what bother me.
What bothers me is that in a sense I'm saying that "transpectual" nontrivia are basically data, givens, facts, i.e., such that one can draw conclusions from them (well, that's the good part), but data are often quantified just like information, in bits, bytes, etc.; so, are data really something different from information or are they merely information such that one doesn't demand that they be new, previously unknown, etc.?
Maybe I shouldn't make a big deal about it, and I already fear that this is one of the most ignorance-parading posts that I've ever written. After all, as I mentioned, there's a duality between optimization and probability where cost (a kind of lowness of feasibility) corresponds to probability. (To repeat myself: one would think it more intuitive that cost would correspond to improbability formulated as 1 minus probability, but I don't know whether that leads to problems or is merely less convenient for expositing the duality.) An amount of information depends in a sense on what question was asked. Did a given horse win a race? Yes or no? That's one bit of information, as if the probability of the horse's winning had been 50% when it almost certainly was not. So maybe I shouldn't worry about data's seeming like information any more than about feasibility's seeming like probability. Now, as to a datum qua datum, we're concerned not with how newsy it is, how improbable it was before it happened, given that which was already known, etc., but with the complication or complexification that it brings (what would have been its "suboptimal" character before it happened) and what conclusions can be drawn from it. Some say that information is a difference that makes a difference. Perhaps one could say that a nontrivium is a basis for a further basis.
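The horse-race remark can be made concrete with a small Python sketch. It rests on one assumption not stated in the post: the "one bit" is read as the size of a yes/no answer, while the information actually conveyed depends on the prior probability of the win. The 10% figure is made up for illustration.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Information, in bits, of an outcome that had the given prior probability."""
    return math.log2(1.0 / probability)


def binary_entropy_bits(p_yes: float) -> float:
    """Average information, in bits, of the answer to a yes/no question."""
    if p_yes in (0.0, 1.0):
        return 0.0  # the answer was already settled, so it carries no news
    p_no = 1.0 - p_yes
    return p_yes * surprisal_bits(p_yes) + p_no * surprisal_bits(p_no)


# An even-odds horse: the yes/no answer is worth exactly one bit on average.
even_odds = binary_entropy_bits(0.5)

# A long shot with a hypothetical 10% chance of winning:
long_shot_average = binary_entropy_bits(0.1)  # well under one bit on average
long_shot_wins = surprisal_bits(0.1)          # a "yes" carries more than three bits
long_shot_loses = surprisal_bits(0.9)         # a "no" carries a small fraction of a bit
```

On this reading, treating the answer as one bit amounts to pretending the odds were even; with uneven odds the expected information is less than a bit, though the unlikely answer taken by itself carries more.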
It also bothers me that this reduces complexity/complication to a kind of randomness. It's as if, in going conceptually from optima to probabilities to information to facts, one settles into a kind of heat death of material or non-logical truths. All I can think of at the moment is that the randomness is real in a sense, but that it's why it matters that the data be data, facts, givens in some sense, not just newsy announcements, or probables, or optima or feasibles.
Looking back at optima for a hint — maybe there's no standard way to quantify optimality, but one can often think of an optimum as a distance with a direction or directions — a shortest path for instance, or the location of a minimum of a curve, etc. Even if it's only a rough idea, still one discerns a pattern, one that I've noticed before:
optimum — difference
probability — ratio
information — logarithm
(Note that this blog's title does include the phrase "Speculation Lounge"!)
So one might expect, simply on the superficial appearance of the pattern, that for the nontrivial one might be able to think of it as the next in the series "difference, ratio, logarithm." As to the ordering "optimum, probability, information," I didn't reach that from considering the pattern "difference, ratio, logarithm." Instead I got it as part of a broad pattern (see table on right).
Well, it's hard to decide the next term after "logarithm" with confidence, when one expects only a four-term series (I expect it for various reasons including the fourfold correlations outlined earlier in this post). Now, subtraction (finding a difference) is the inverse of addition, and division (finding a quotient or ratio) is the inverse of multiplication. Yet finding a logarithm is only one of two inverses of exponentiation (raising to a power), the other being to find a root or base. A root with a direction? (Now I'm thinking of complex roots.) A base? Multi-valued logic? MVL has not been a big, thriving field, so far as I can tell, but on the other hand fuzzy logic is a kind of MVL, so maybe I shouldn't speak so fast. Anyway, if you have a higher numeric base, a larger alphabet, a larger lexicon, etc., you can express things with more concision, and in a sense you have increased memory capacity too. Do you then have an increase, in some sense, in that which is worth remembering (learn the ABCs, expand your vocabulary, etc.)? (I resist this in part because of the terminological coincidence between a numeric base and a basis for a conclusion. Is it just a pun of ideas?) The other alternative seems to be the hyperlogarithm, or maybe an endless series (hyperlog, hyper-hyperlog, etc.), some sort of orders of nontriviality; one starts thinking of powersets and so on. Now, all that I'm seeking here is an idea in terms of which we can think merely roughly of the nontrivial, but this sort of thing leaves me shaking my head as usual.
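For what it's worth, the point about exponentiation having two inverses can be sketched in Python. The helper names are hypothetical and the numbers are arbitrary; floating-point results are compared approximately rather than exactly.

```python
import math


def recover_exponent(result: float, base: float) -> float:
    """Logarithm: given base**exponent == result, find the exponent."""
    return math.log(result, base)


def recover_base(result: float, exponent: float) -> float:
    """Root: given base**exponent == result, find the base."""
    return result ** (1.0 / exponent)


# Addition and multiplication are commutative, so each needs only one inverse.
assert 3 + 5 == 5 + 3
assert 3 * 5 == 5 * 3

# Exponentiation is not commutative, so "undoing" it splits into two questions.
assert 2 ** 10 != 10 ** 2  # 1024 versus 100

result = 2 ** 10
exponent_found = recover_exponent(result, 2)  # approximately 10
base_found = recover_base(result, 10)         # approximately 2
```

That asymmetry is why the series "difference, ratio, logarithm" does not have one obvious next term: after the logarithm, the root (or base) is the other candidate on the same level.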
So I have to leave it here for the time being as it stands. It's a difficult question that has me taking shots in the dark.
(Note on the double-chiasm image near post's top: I've given Hyatt Carter total permission to use the image freely as he pleases, for example here.)
| 1. Simplicity, optimality, etc. | [Image: double-chiasm diagram] | 3. Informativeness, significance, etc. |
| 2. Likeliness, probability, etc. | | 4. Nontriviality, depth, etc. |
Shalizi, with a bluntness that is helpful to the general reader, starts out his above-linked notebook "Complexity Measures" with this striking paragraph:
C'est magnifique, mais ce n'est pas de la science. (Lots of 'em ain't that splendid, either.) This is, in the word of the estimable Dave Feldman (who taught me most of what I know about it, but has rather less jaundiced views), a "micro-field" within the soi-disant study of complexity. Every few months seems to produce another paper proposing yet another measure of complexity, generally a quantity which can't be computed for anything you'd actually care to know about, if at all. These quantities are almost never related to any other variable, so they form no part of any theory telling us when or how things get complex, and are usually just quantification for quantification's own sweet sake.
Now, one may note that there also seems to be no general quantification of "optimality," either; instead, one seeks specific optima (idempotency of probability measures is involved in optimization; I guess that that's what one gets instead of a variable "amount" of optimality). As for a quantity of feasibility, it might just be a roundabout way of locating an optimum (a located feasible solution getting characterized, say, by a distance and direction from the optimum). If it's just lowness of cost (compared to a highest feasible cost) or size of net benefit (minus some lowest feasible net benefit, I guess) or some unifying generalization of those ideas (I don't know what), it still isn't like a ratio, comparable across disparate cases. If we set the optimum to unity in order to get that comparability, then can feasibility come out like probability? There's a duality between optimization and probability where cost corresponds to probability. (One would think it more intuitive that cost would correspond to improbability formulated as 1 minus probability, but I don't know whether that leads to problems or is merely less convenient for expositing the duality. Here's a paper (PDF) of which I understood maybe three sentences and one formula.)
| Modern science has been builded after the model of Galileo, who founded it on il lume naturale. That truly inspired prophet had said that, of two hypotheses, the simpler is to be preferred; but I was formerly one of those who, in our dull self-conceit fancying ourselves more sly than he, twisted the maxim to mean the logically simpler, the one that adds the least to what has been observed, in spite of three obvious objections: first, that so there was no support for any hypothesis; secondly, that by the same token we ought to content ourselves with simply formulating the special observations actually made; and thirdly, that every advance of science that further opens the truth to our view discloses a world of unexpected complications. It was not until long experience forced me to realise that subsequent discoveries were every time showing I had been wrong, while those who understood the maxim as Galileo had done, early unlocked the secret, that the scales fell from my eyes and my mind awoke to the broad and flaming daylight that it is the simpler Hypothesis in the sense of the more facile and natural, the one that instinct suggests, that must be preferred; for the reason that unless man have a natural bent in accordance with nature's, he has no chance of understanding nature at all. Many tests of this principal and positive fact, relating as well to my own studies as to the researches of others, have confirmed me in this opinion; and when I shall come to set them forth in a book, their array will convince everybody. Oh no! I am forgetting that armour, impenetrable by accurate thought, in which the rank and file of minds are clad! They may, for example, get the notion that my proposition involves a denial of the rigidity of the laws of association: it would be quite on a par with much that is current. I do not mean that logical simplicity is a consideration of no value at all, but only that its value is badly secondary to that of simplicity in the other sense. 
— Charles Sanders Peirce, "A Neglected Argument for the Reality of God." |
2. An induction to a trend seeks a kind of aspectual probability, one for which one does not expect a deductive standard probability measure (well, maybe in Bayesian probability, which I don't understand well enough to discuss; but those "priors" are not arrived at deductively). A lot of statistical inference seems to involve avoiding the pitfalls of reliance on subjective probability, while still giving us something aspectually probable that satisfies our desire for something like it; a frequentist might say that good statistical inference attempts not to formulate our expectations but rather to tell us what, if anything, is worth expecting, to the extent that that's just a way of talking about the light (if any) cast upon the future by objective ratios among cases. Actually an induction (adding but not removing information) increases not probability but information. That's ironic. It's as if induction were seeking to compensate for its informativeness by seeking likeliness.
3. A syllogistic or other "forward-only" deduction seeks to bring information to light - but it doesn't really increase information, it reduces it. That's ironic. There's little that I can find about efforts to quantify the "psychological novelty" (as various folks have called it) or "new aspect" (as Peirce called it) or seeming informativeness of a forward-only deduction's conclusion. It's aspectual. (Maybe there's a Bayesian way!) Another way to look at it is that a forward-only deduction increases probability (if the premisses are assigned probabilities between 0% and 100%); in order to be true, the conclusion doesn't absolutely need all (or sometimes any) of the premisses to also be true. Anyway, it's as if forward-only deduction were seeking to compensate for its decrease of information (or increase of probability) by seeking newsiness.
4. So, continuing the pattern, the nontriviality of a deduction through equivalences or equipollences will be aspectual and ironic. Equipollential deduction neither adds nor removes data, and it's as if it were seeking to compensate for that simplicity with a kind of complexity in the sense here called aspectual. The transpectual complexity or complexus will involve independences, mutual non-redundancies, etc. It is surmise (by which I mean inference that both adds and removes information) that is transpectually nontrivial, even though surmise ironically seeks a kind of aspectual simplicity, naturalness, etc. Now, an aspectual informativeness (psychological novelty, new aspect, whatever one wishes to call it) is sought through a syllogistic or other forward-only deduction — an extrication of information by removing some of the clutter, so to speak, of the premisses. That (aspectual) informativeness is not to be confused with its kin, the (aspectual) nontriviality that is sought through equipollential deduction; that nontriviality consists (as far as I can tell) in the outward disparities of things bridged by a proven equipollence, a bridge which one may wish to cross and recross in either direction.
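The bookkeeping in items 2 through 4 can be tried out on a toy model. The sketch below is only an illustration under loud assumptions: a made-up universe of eight equally likely "worlds" (truth assignments to three atomic propositions), probability taken as the share of worlds in which a proposition holds, and information taken as the logarithm of the reciprocal of that probability. The function names are hypothetical, and reading "adds information" as "the conclusion rules out worlds that the premiss allowed" is one interpretation, not the only possible one.

```python
from fractions import Fraction
from itertools import product
from math import log2

# Assumption: a toy universe of 8 equally likely "worlds", one per truth
# assignment to three atomic propositions a, b, c.
WORLDS = list(product([False, True], repeat=3))


def models(prop):
    """Return the set of worlds in which prop(a, b, c) is true."""
    return frozenset(w for w in WORLDS if prop(*w))


def probability(prop):
    """Share of worlds in which prop holds (uniform measure assumed)."""
    return Fraction(len(models(prop)), len(WORLDS))


def information_bits(prop):
    """Logarithm (base 2) of the reciprocal of the probability."""
    return log2(1 / float(probability(prop)))


def classify(premiss, conclusion):
    """Name the inference mode by what the conclusion adds or removes."""
    p, c = models(premiss), models(conclusion)
    if p == c:
        return "equipollential deduction"  # neither adds nor removes
    if p < c:
        return "forward-only deduction"    # removes information only
    if c < p:
        return "induction"                 # adds information only
    return "surmise"                       # both adds and removes


def a_and_b(a, b, c):
    return a and b


def just_a(a, b, c):
    return a


def b_and_c(a, b, c):
    return b and c


def not_both(a, b, c):
    return not (a and b)


def either_not(a, b, c):
    return (not a) or (not b)


if __name__ == "__main__":
    cases = [
        (a_and_b, just_a),
        (just_a, a_and_b),
        (not_both, either_not),
        (a_and_b, b_and_c),
    ]
    for premiss, conclusion in cases:
        print(
            classify(premiss, conclusion),
            "| probability:", probability(premiss), "->", probability(conclusion),
            "| bits:", information_bits(premiss), "->", information_bits(conclusion),
        )
```

In this toy setting the forward-only step from "a and b" to "a" raises the probability from 1/4 to 1/2 and lowers the information from 2 bits to 1, the reverse step does the opposite, and the De Morgan equivalence leaves both numbers unchanged. None of this measures the aspectual novelty or nontriviality discussed above; it only tracks the transpectual side.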
But can't I do better than that? Here I'm describing as aspectual the typical sense of "nontrivial" in mathematical talk, but what makes something "transpectually" nontrivial? Is that kind of nontrivial simply a set of independent facts or truths or givens, i.e. they couldn't have proven or disproven one another? Do they have to be "facts or givens worth remembering" or is it enough that their interrelations are facts or givens worth remembering? Are such complexuses really the core of logical ideas such that logic should have been named for them ("nontrivium theory" or "complexus theory" or whatever), just as probability theory is named for probability, and so on? They may be optimal or otherwise (or more precisely, perhaps, they may be such that they would have been optimal or otherwise); but they are the paths which have been traveled, the structures which have been built. Is that it? Is a "transpectually nontrivial" statement simply one that is consistent and materially true, just not tautologously true? But isn't logic about formal truth, not material truth?
Actually that's not what bothers me. Basic deductive logic is about deducing material truths from other material truths - more or less, facts from facts, be the basal facts postulated or established observationally or merely supposed as premisses for the sake of argument. In that sense deductive logic is about material or nontautologous truths in the same sense that probability theory is about probabilities (and optimization theory is about optima, and information theory is about information). I like that idea of transpectual nontriviality: it avoids suggesting that lengthy convolution of an argument is the essence of nontriviality or depth and somehow logically "better," more "logicful," when in real life such an argument is riskier, less likely to escape a weakest-link problem. Such convolution increases aspectual nontriviality (sometimes only in a superficial way, to boot), not transpectual nontriviality, much less security or factuality.
The conclusion of a probability is not the probability of that conclusion, and the same goes for the conclusion of a nontriviality and the nontriviality (or lack thereof) of that conclusion as a conclusion. A syllogistic deduction does not turn its concluding proposition into a formal truth (or a logical truth, if you prefer). The fact that Socrates is mortal is a material truth even if deduced from other material truths, even if deduced from a postulate or axiom that Socrates is mortal. If it is postulated that Socrates is mortal in advance of premisses, a premissual proposition "Socrates is mortal" is part of the tautology "Socrates is mortal by the postulate that Socrates is mortal", but the fact, the datum, that Socrates is mortal is not tautologously true. The nontrivium is that basis on which conclusions - further bases - can be drawn. This is a kind of basality which is not the same thing as basicness or fundamentality. A set of such postulates, or, say, Euclid's five postulates, independent (and consistent), has more transpectual nontriviality or depth than any single such postulate. So, if nontriviality can't be usefully quantified, maybe it can at least be ordered. Add a postulate, enrich or deepen the system - transpectually if not aspectually. (Should one say that Gödel statements are transpectually nontrivial but aspectually trivial in the mathematical system in which they are true but unprovable?) Even an axiom of propositional logic is not completely trivial, when it is introduced as an axiom, though from it by itself there follows little if anything. Those considerations may seem a bit slippery but they're not what bothers me.
What bothers me is that in a sense I'm saying that "transpectual" nontrivia are basically data, givens, facts, i.e., such that one can draw conclusions from them (well, that's the good part), but data are often quantified just like information, in bits, bytes, etc.; so, are data really something different from information or are they merely information such that one doesn't demand that they be new, previously unknown, etc.?
Maybe I shouldn't make a big deal about it, and I already fear that this is one of the most ignorance-parading posts that I've ever written. After all, as I mentioned, there's a duality between optimization and probability where cost (a kind of lowness of feasibility) corresponds to probability. (To repeat myself: one would think it more intuitive that cost would correspond to improbability formulated as 1 minus probability, but I don't know whether that leads to problems or is merely less convenient for expositing the duality.) An amount of information depends in a sense on what question was asked. Did a given horse win a race? Yes or no? That's one bit of information, as if the probability of the horse's winning had been 50% when it almost certainly was not. So maybe I shouldn't worry about data's seeming like information any more than about feasibility's seeming like probability. Now, as to a datum qua datum, we're concerned not with how newsy it is, how improbable it was before it happened, given that which was already known, etc., but with the complication or complexification that it brings (what would have been its "suboptimal" character before it happened) and what conclusions can be drawn from it. Some say that information is a difference that makes a difference. Perhaps one could say that a nontrivium is a basis for a further basis.
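The horse-race point can be put in numbers. What follows is a minimal sketch, assuming the usual Shannon-style measures and a made-up prior of 10% for the horse's winning; the function names are hypothetical. It shows that a yes-or-no answer always fits in one binary digit, while the information it carries on average reaches a full bit only when the prior really is 50%.

```python
from math import log2


def surprisal_bits(p):
    """Information, in bits, carried by an outcome of prior probability p."""
    return log2(1 / p)


def question_entropy_bits(p_yes):
    """Average information, in bits, of the answer to a yes/no question."""
    p_no = 1 - p_yes
    # Outcomes of probability zero contribute nothing to the average.
    return sum(p * surprisal_bits(p) for p in (p_yes, p_no) if p > 0)


if __name__ == "__main__":
    # Assumption: a hypothetical long shot with a 10% chance of winning.
    p_win = 0.1
    print("news value of 'yes' at 50%:", surprisal_bits(0.5))
    print("news value of 'yes' at 10%:", surprisal_bits(p_win))
    print("news value of 'no' at 10%: ", surprisal_bits(1 - p_win))
    print("average for the question at 50%:", question_entropy_bits(0.5))
    print("average for the question at 10%:", question_entropy_bits(p_win))
```

On this reading, the "one bit" in the horse example is the size of the question's answer slot, not the newsiness of the particular answer: a win by the long shot is worth more than one bit, and a loss is worth less.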
It also bothers me that this reduces complexity/complication to a kind of randomness. It's as if, in going conceptually from optima to probabilities to information to facts, one settles into a kind of heat death of material or non-logical truths. All I can think of at the moment is that the randomness is real in a sense, but that it's why it matters that the data be data, facts, givens in some sense, not just newsy announcements, or probables, or optima or feasibles.
Looking back at optima for a hint — maybe there's no standard way to quantify optimality, but one can often think of an optimum as a distance with a direction or directions — a shortest path for instance, or the location of a minimum of a curve, etc. Even if it's only a rough idea, still one discerns a pattern, one that I've noticed before:
optimum — difference
probability — ratio
information — logarithm
(Note that this blog's title does include the phrase "Speculation Lounge"!)
So one might expect, simply on the superficial appearance of the pattern, that for the nontrivial one might be able to think of it as the next in the series "difference, ratio, logarithm." As to the ordering "optimum, probability, information," I didn't reach that from considering the pattern "difference, ratio, logarithm." Instead I got it as part of a broad pattern (see table on right).
| Some sort of proportion or analogy here. | |
| Motion, forces. | |
| Matter. | |
| Life. | |
bases (for further conclusions). | Mind. |
So I have to leave it here for the time being as it stands. It's a difficult question that has me taking shots in the dark.
(Note on the double-chiasm image near post's top: I've given Hyatt Carter total permission to use the image freely as he pleases, for example here.)




