Coarse Grained

  • rss
  • archive
  • Phone Numbers and Precommitment

    This plotline, where Jerry gives Kramer a woman’s number to throw away and then asks for it back, actually seems more plausible now than it did when it aired in the early ’90s, though the mechanics would be different today.

    It used to be that any number called with any frequency would inevitably get committed to memory, since you had to punch in the whole number every time [1]. Now we simply enter it once to store it in our cell phones, promptly forgetting it.

    Even though we’re in a sense “losing” our ability to remember phone numbers, this shift has one major benefit - it allows for effective precommitment. Now if you decide you don’t ever want to call someone again, you can simply delete their number from your phone - you no longer have to resist the urge every time it strikes you.

    image

    Unfortunately, Jerry learns that Kramer is not an effective commitment device.

    I wonder if the ability to effectively precommit has had any social effects. I’d wager that, on balance, people are happier if they’re prevented from calling an ex they know they shouldn’t in a moment of weakness/drunkenness.

    It’s easy to imagine things being designed with this limitation in mind. For instance, suppose the internet worked slightly differently and required typing in IP addresses instead of web addresses. Would internet addiction be less of a “thing” if it were possible to stop yourself from going to websites simply by deleting their links from your browser[2]?

    Are there any other cases where we can shore up one mental limitation (limited willpower/improper discounting) by way of a different mental limitation (weak memory)?

    -

    [1] Interestingly enough, this occurred even though most landlines had number storage in the form of speed dial. It’s not clear to me why speed dial didn’t serve as an effective number storage mechanism (in the way that cell phones do), or why regular old number storage never became a key feature on landlines. A few guesses as to why:

    • Landline phones were commodity products built by electronics manufacturers who weren’t particularly interested in being innovative
    • Landline phones had an extremely long upgrade cycle (a phone built in 1960 can do most of what one built in 1990 can), so any new features would take a long time to penetrate. (This goes hand in hand with the first reason).
    • Speed dial had an awkward UI (the lack of a screen meant you couldn’t see the number you’re dialing, and that entering a number required remembering or figuring out how to do it), which prevented it from being universally used.
    • Speed dial could only store a few numbers, so you’d still end up having to remember or write down any numbers beyond that.
    • Speed dial was seen as more of a “time saving device” (it’s right there in the name) than a “number storage device”
    • Number storage just isn’t seen as “sexy”, so it’s hard to get people to pay for it or care about it (this seems true-ish even today: number storage doesn’t seem to have ever been a touted feature).

    However, even though I don’t remember speed dial being much of a “thing”, it was still popular enough to become a plot point in a different Seinfeld episode:

    [2] This also suggests that effective search limits our ability to precommit - it’s hard to remove options from yourself when you have such a powerful tool for finding things.

    • 2 years ago
    • 1 notes
  • UX and Cognition

    From Ex-Apple Designer Creates Teaching UI That “Kills Math” Using Data Viz:

    Have you ever tried multiplying roman numerals? It’s incredibly, ridiculously difficult. That’s why, before the 14th century, everyone thought that multiplication was an incredibly difficult concept, and only for the mathematical elite. Then arabic numerals came along, with their nice place values, and we discovered that even seven-year-olds can handle multiplication just fine. There was nothing difficult about the concept of multiplication – the problem was that numbers, at the time, had a bad user interface.
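
    The “bad user interface” point can be made concrete in code (the sketch and examples below are mine, not the article’s): even today, the only practical way to multiply Roman numerals is to translate them into place-value form first - the representation itself gives the multiplication algorithm nothing to grip.

```python
# Roman numerals have no place values, so "multiplying" them really means
# converting to integers first. The conversion is where all the work lives.
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(s: str) -> int:
    """Convert a Roman numeral to an integer (handles subtractive notation)."""
    total = 0
    for ch, nxt in zip(s, s[1:] + " "):
        value = ROMAN[ch]
        # A smaller value before a larger one (e.g. the I in IV) is subtracted.
        total += -value if nxt != " " and ROMAN[nxt] > value else value
    return total

def multiply_roman(a: str, b: str) -> int:
    return roman_to_int(a) * roman_to_int(b)

print(multiply_roman("XIV", "VII"))  # 14 * 7 = 98
```

    Arabic numerals skip the conversion step entirely: the digits *are* the place-value form, which is exactly the interface advantage the quote describes.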

    Source: fastcodesign.com
    • 3 years ago
    • 1 notes
    • #cognitiveprosthesis
  • Charity and Optimization Curves

    From Marginal Charity:

    We make many choices, both as individuals and as organizations; we choose prices, qualities, locations, etc. We often make such choices to maximize some sort of private gain, shown in red. Such private choices also usually have effects on the gain of the rest of the world, shown in black. In general the social gain curve peaks at a different point than the private gain curve, because there are usually many market failures associated with our choices. (As the absolute curve heights are irrelevant, I’ve arbitrarily let them intersect where private gain peaks.)

    At the choice that maximizes private value, a small change in the direction of raising social gain, as shown by the yellow arrow, comes at only a tiny loss in private gain. In fact, in the limit of going to the exact private gain maximizing choice, the ratio of the rates of change of social gain and private loss approaches infinity!

    The lesson: if you aren’t already doing it, by far the most cost-effective way to help the world is to shade your selfish choices just a little in the direction of making the world a better place. If you have market power when you sell a product, lower your price just a tad. If you have market power when you sell your labor, lower your wage a bit. Instead of choosing the profit-maximizing quality for your product or labor, increase that quality a little. If twenty floors would be the most profitable height for your apartment complex, add one more floor. And so on. (And maybe learn some econ, so you can better see which direction is good.)

    And from the comments:

    Some people are not getting that it’s all about the slopes of the respective curves near the max of the Private curve, not their values. The slope of the Private curve is usually near zero in that region. The slope of the Social curve is generally not near zero. Given the definitions of the curves, this presents an opportunity to gain a lot of Social utility for very little reduction in Private utility. Iterated over many decisions, much good can come of this for relatively little Private cost.
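
    The slope argument is easy to verify numerically with two made-up quadratic gain curves (my choice, for illustration): private gain peaking at x = 0 and social gain peaking at x = 1. Near the private optimum, the ratio of social gain to private loss blows up:

```python
# Toy gain curves: private peaks at x = 0 (slope ~0 there), social peaks at x = 1.
private = lambda x: -(x ** 2)
social = lambda x: -((x - 1) ** 2)

def slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Shade the choice slightly away from the private optimum, toward the social one.
for step in (0.1, 0.01, 0.001):
    ratio = slope(social, step) / -slope(private, step)
    print(f"step={step}: social gain per unit of private loss ~ {ratio:.1f}")
```

    For these curves the ratio is (1 - step)/step, so it grows without bound as the step shrinks - the “approaches infinity” claim in the quote.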

    Source: overcomingbias.com
    • 3 years ago
    • 1 notes
    • #econ
  • Design and Operations

    From The Insourcing Boom:

    “We got the water heater into the room, and the first thing [the group] said to us was ‘This is just a mess,’ ” Nolan recalls. Not the product, but the design. “In terms of manufacturability, it was terrible.”

    The GeoSpring suffered from an advanced-technology version of “IKEA Syndrome.” It was so hard to assemble that no one in the big room wanted to make it. Instead they redesigned it. The team eliminated 1 out of every 5 parts. It cut the cost of the materials by 25 percent. It eliminated the tangle of tubing that couldn’t be easily welded. By considering the workers who would have to put the water heater together—in fact, by having those workers right at the table, looking at the design as it was drawn—the team cut the work hours necessary to assemble the water heater from 10 hours in China to two hours in Louisville.

    In the end, says Nolan, not one part was the same.

    So a funny thing happened to the GeoSpring on the way from the cheap Chinese factory to the expensive Kentucky factory: The material cost went down. The labor required to make it went down. The quality went up. Even the energy efficiency went up.

    GE wasn’t just able to hold the retail sticker to the “China price.” It beat that price by nearly 20 percent. The China-made GeoSpring retailed for $1,599. The Louisville-made GeoSpring retails for $1,299.

    …The dishwasher’s initial assembly-line redesign was a primitive version of lean. The full-blown, sophisticated version has spread across Appliance Park, into the work of the engineers, the designers, the salespeople, the bosses. Another team took a design for a new dishwasher into a room and pulled it apart. As originally designed, the door had four visible screws. The marketing people on the team wanted the door to have no visible screws—they wanted it iPhone-sleek. The operators loved that idea—four screws is a lot of assembly-line work. The engineers and designers came up with a design that holds the door together with one hidden screw and a rod.

    “It’s easier to assemble,” says Calvaruso. “It’s cheaper. And the fit, feel, and finish are better.”

    If the people who design dishwashers sit at their desks in one building, and the people who sell them to retailers and consumers sit at their desks in another building, and the people who make the dishwashers are in a different country and speak a different language—you never realize that the four screws should disappear, let alone come up with a way they can. The story of the four disappearing screws on that dishwasher door is why Jeffrey Immelt has the confidence to spend $800 million to bring Appliance Park back to life.

    …GE is rediscovering that how you run the factory is a technology in and of itself. Your factory is really a laboratory—and the R&D that can happen there, if you pay attention, is worth a lot more to the bottom line than the cost savings of cheap labor in someone else’s factory.

    Source: The Atlantic
    • 3 years ago
    • #design
    • #operations
  • Information Scent

    From Deceivingly Strong Information Scent Costs Sales:

    Information scent refers to the extent to which users can predict what they will find if they pursue a certain path through a website. The term is part of information foraging theory, which explains how users interact with systems using the analogy of animals hunting for food.

    Predators following a strong spoor are firmly convinced that they’ll find their prey at the end of the trail, and thus are less likely to be distracted and wander off the path.

    Similarly, if users are clicking through a site hunting for specific products or answers, they’ll keep going as long as they continue to find links that seem to take them closer and closer to their goal.

    Information scent can backfire if a strong attractor seems to be the answer, but isn’t. We found an example of this in the teen area of kidshealth.org, which we recently tested in our study of how teenagers use the Web.

    During our test, several teenage users failed the simple task of finding out how much they can weigh without being considered overweight. One of the site’s articles, “What’s the Right Weight for My Height?” is a great example of microcontent: it’s explicit, short, and easy to understand. In addition to having a good title, the article is prominently featured in a site area entitled “Food & Fitness” – a label with attractive information scent for our users’ assigned task. The article also comes up fairly high on a search for “weight.”

    Great so far. Except the article doesn’t contain the answer to the question.

    Unfortunately, because the path to the article has good information scent, and because the article itself has very strong information scent, our users concluded that the site didn’t contain the required information. After all, they’d found the one place where this information ought to be, and it wasn’t there.

    If the scent of information is sufficiently pungent, people are generally convinced that they’re looking in the right place. If that place doesn’t contain what they want, they’re likely to conclude that the site doesn’t offer it at all.

    Source: useit.com
    • 3 years ago
  • Universal Theory of Decisions

    From a LW Comment:

    I came up with this whole thing some years ago and dubbed it the ‘Universal Theory of Decisions’, which, stated in one line, is: ‘A decision is either easy or it doesn’t matter’.

    There’s a corollary, though, which I’ve never managed to get as succinct as the first bit: If the decision isn’t easy and does (seem to) matter, then you’re thinking about the wrong decision. This covers the situations like getting stuck deciding which university to go to. The real decision is usually something like ‘do I have enough information to make this decision?’, to which the answer is No, so you just get on and get more information: no agonising required.

    Someone pointed out recently that this Universal Theory of Decisions is closely related to Susan Blackmore’s ‘no free-will’ approach outlined in The Meme Machine. Whether it is or not, I’ve found that the application frees up my time and mental energy to get on with things that are more productive. I do occasionally need to be reminded that I’m stressing over a decision that doesn’t matter, though. But then I guess that means I’m still human.

    Source: lesswrong.com
    • 3 years ago
    • #decisiontheory
  • Design and Geography

    The difficulty of design - sometimes you need to account for your latitude and longitude:

    There were at least three problems with the Mark 14 submarine-launched torpedo…

    3) The magnetic exploder was designed in the northern latitudes and did not work as well at the equator. The British and Germans had already disabled their magnetic exploders before the USN ordered theirs disabled 24June43. ComSubSWPac had participated in the development of the magnetic exploder, knew the principle was sound, and resisted disablement until Dec'43.

    Source: ww2pacific.com
    • 3 years ago
    • #design
  • A-B Testing and Campaign Email

    From The Science Behind Those Obama Campaign Emails:

    The appeals were the product of rigorous experimentation by a large team of analysts. “We did extensive A-B testing not just on the subject lines and the amount of money we would ask people for,” says Amelia Showalter, director of digital analytics, “but on the messages themselves and even the formatting.” The campaign would test multiple drafts and subject lines—often as many as 18 variations—before picking a winner to blast out to tens of millions of subscribers. “When we saw something that really moved the dial, we would adopt it,” says Toby Fallsgraff, the campaign’s e-mail director, who oversaw a staff of 20 writers.

    It quickly became clear that a casual tone was usually most effective. “The subject lines that worked best were things you might see in your in-box from other people,” Fallsgraff says. “ ‘Hey’ was probably the best one we had over the duration.” Another blockbuster in June simply read, “I will be outspent.”

    …Writers, analysts, and managers routinely bet on which lines would perform best and worst. “We were so bad at predicting what would win that it only reinforced the need to constantly keep testing,” says Showalter. “Every time something really ugly won, it would shock me: giant-size fonts for links, plain-text links vs. pretty ‘Donate’ buttons. Eventually we got to thinking, ‘How could we make things even less attractive?’ That’s how we arrived at the ugly yellow highlighting on the sections we wanted to draw people’s eye to.”

    Another unexpected hit: profanity. Dropping in mild curse words such as “Hell yeah, I like Obamacare” got big clicks. But these triumphs were fleeting. There was no such thing as the perfect e-mail; every breakthrough had a shelf life. “Eventually the novelty wore off, and we had to go back and retest,” says Showalter.

    Fortunately for Obama and all political campaigns that will follow, the tests did yield one major counterintuitive insight: Most people have a nearly limitless capacity for e-mail and won’t unsubscribe no matter how many they’re sent. 
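
    The article doesn’t describe the statistical machinery behind “picking a winner,” but a standard way to compare two subject lines is a two-proportion z-test on click-through rates. A minimal sketch - the send sizes and click counts below are invented for illustration:

```python
import math

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z-statistic for H0: the two variants have equal click-through rates."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)  # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical test cells: "Hey" vs. a conventional subject, 50k recipients each.
z = two_proportion_z(2600, 50_000, 2400, 50_000)
print(f"z = {z:.2f}")  # |z| > 1.96 means the gap is unlikely to be chance at the 5% level
```

    With 18 variations per send, a real pipeline would also need a multiple-comparisons correction before declaring a winner - one reason large test cells matter.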

    • 3 years ago
  • Chemistry and Expert Reasoning

    From How Do Chemists (Think That They) Judge Compounds:

    Interestingly, once the 19 chemists had made their choices (and reported the criteria they used in doing so), the authors went through the selections using two computational classification algorithms, semi-naïve Bayesian (SNB) and Random Forest (RF). This showed that most of the chemists actually used only one or two categories as important filters, a result that ties in with studies in other fields on how experts in a given subject make decisions. Reducing the complexity of a multifactorial problem is a key step for the human brain to deal with it; how well this reduction is done (trading accuracy for speed) is what can distinguish an expert from someone who’s never faced a particular problem before.

    But the chemists in this sample didn’t all zoom in on the same factors. One chemist showed a strong preference away from the compounds with a higher polar surface area, for example, while another seemed to make size the most important descriptor. The ones using functional groups to pick compounds also showed some individual preferences - one chemist, for example, seemed to downgrade heteroaromatic compounds, unless they also had a carboxylic acid, in which case they moved back up the list…

    Comparing structural preferences across the chemists revealed many differences of opinion as well. One of them seemed to like fused six-membered aromatic rings (that would not have been me, had I been in the data set!), while others marked those down. Some tricyclic structures were strongly favored by one chemist, and strongly disfavored by another, which makes me wonder if the authors were tempted to get the two of them together and let them fight it out…

    Then comes a key question: how similar were the chemists’ picks to each other, or to their own previous selections? A well-known paper from a few years ago suggested that the same chemists, looking at the same list after the passage of time (and more lists!), would pick rather different sets of compounds. (Update: see the comments for some interesting inside information on this work.) Here, the authors sprinkled in a couple of hundred compounds that were present in more than one list to test this out. And I’d say that the earlier results were replicated fairly well. Comparing chemists’ picks to themselves, the average similarity was only 0.52, which the authors describe, perhaps charitably, as “moderately internally consistent”.

    But that’s a unanimous chorus compared to the consensus between chemists. These had similarities ranging from 0.05 (!) to 0.52, with an average of 0.28. Overall, only 8% of the compounds had the same judgement passed on them by at least 75% of the chemists. And the great majority of those agreements were on bad compounds, as opposed to good ones: only 1% of the compounds were deemed good by at least 75% of the group!
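
    The excerpt doesn’t name the similarity metric, but the natural candidate for comparing two sets of picks is the Tanimoto (Jaccard) coefficient, the standard overlap measure in cheminformatics. A sketch with hypothetical compound IDs:

```python
def tanimoto(picks_a, picks_b):
    """Tanimoto/Jaccard similarity: shared picks over all distinct picks."""
    a, b = set(picks_a), set(picks_b)
    return len(a & b) / len(a | b) if a | b else 1.0

# Hypothetical selections from the same shared compound list.
chemist_1 = {"c01", "c04", "c07", "c09", "c12"}
chemist_2 = {"c01", "c03", "c07", "c11", "c12"}
print(round(tanimoto(chemist_1, chemist_2), 2))  # 3 shared of 7 distinct -> 0.43
```

    On this scale, the reported averages - 0.52 against a chemist’s own past picks, 0.28 against colleagues - mean roughly a third to a half of distinct selections overlapped.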

    There’s one other interesting result to consider: recall that the chemists were asked to state what factors they used in making their decisions. How did those compare to what they actually seemed to find important? (An economist would call this a case of stated preference versus revealed preference.) The authors call this an assessment of the chemists’ self-awareness, which in my experience, is often a swampy area indeed. And that’s what it turned out to be here as well: “…every single chemist reported properties that were never identified as important by our SNB or RF classifiers…chemist 3 reported that several properties were important, but failed to report that size played any role during selections. Our SNB and RF classifiers both revealed that size, an especially straightforward parameter to assess, was the most important.”

    Source: pipeline.corante.com
    • 3 years ago
    • #mechanicsofreasoning
  • Morality and Economics

    From LiveJournal:

    As a practical matter, I have much better information about my own preferences than about the preferences of other humans, and between that and transaction costs, I have a comparative advantage seeking to satisfy my own preferences over seeking to satisfy the preferences of other humans.

    Source: squid314.livejournal.com
    • 3 years ago
  • Attention Markets

    From Your Online Attention, Bought in an Instant:

    On the Web, powerful algorithms are sizing you up, based on myriad data points: what you Google, the sites you visit, the ads you click. Then, in real time, the chance to show you an ad is auctioned to the highest bidder.

    Not so long ago, they simply bought ad spaces based on a site’s general demographics and then showed every visitor the same ad, a practice called “spray and pray.” Now marketers can aim just at their ideal customers — like football fans who earn more than $100,000 a year, or mothers in Denver in the market for an S.U.V. — showing them tailored ads at the exact moment they are available on a specific Web page.

    “We are not buying content as a proxy for audience,” says Paul Alfieri, the vice president for marketing at Turn, a data management company and automated buy-side platform for marketers based in Redwood City, Calif. “We are just buying who the audience is.”

    …Most sites, Mr. Addante explains, compile data about their own visitors through member registration or by placing bits of computer code called cookies on people’s browsers to collect information about their online activities. To those first-party profiles, Rubicon typically adds details from third-party data aggregators, like BlueKai or eXelate, such as users’ sex and age, interests, estimated income range and past purchases. Finally, Rubicon applies its own analytics to estimate the fair market value of site visitors and the ad spaces they are available to see.

    The whole process typically takes less than 30 milliseconds.

    …Real-time dashboards like Turn’s, he says, have modernized the online ad trade in the same way that Bloomberg terminals revolutionized Wall Street trading. Ad agencies and brands can now check the intraday prices for various impressions. Many ad agencies have even created in-house “trading desks” to monitor and adjust their bids.

    But Turn’s dashboard is more than a real-time ticker. It’s an analytics system that enables clients like insurers or car companies to identify common details among their best customer segments and then bid to show ads to people who resemble those best customers. The machine learning process gets better at pinpointing ideal audiences over the course of an ad campaign.

    For example, Turn recently ran an ad campaign for a sneaker company that initially chose to buy a wide variety of impressions nationwide. But as Turn’s system analyzed the early sets of results, it began to separate audiences into the kinds of people who clicked on those sneaker ads, or later searched for the shoes on their own, and those who did not. Identifying common details among those people required the system to comb through its databank of nearly a billion user profiles for each transaction.
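
    The article doesn’t spell out the auction mechanics, but real-time ad exchanges have traditionally run sealed-bid second-price auctions: the highest bidder wins the impression but pays the runner-up’s bid. A minimal sketch, with invented bidders and CPM figures:

```python
def run_auction(bids):
    """Second-price sealed-bid auction: highest bidder wins, pays the second bid."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else ranked[0][1]
    return winner, price

bids = {"sneaker_brand": 4.10, "insurer": 3.75, "car_company": 2.60}  # $ per CPM
winner, price = run_auction(bids)
print(f"{winner} wins the impression at ${price:.2f} CPM")
```

    The second-price rule is what makes truthful bidding the sensible strategy, which matters when, as the article notes, the whole exchange has under 30 milliseconds to clear.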

    Source: The New York Times
    • 3 years ago
    • #econ
  • Pain and Signal

    The risks of being denied information. From The Hazards of Growing Up Painlessly:

    …Tara was afraid she would forget the words, so she asked him to write them down. The doctor took out a business card and wrote on the back: “Congenital insensitivity to pain.”

    …“For his birthday, he’d wanted to do something for his friends — he’d wanted to jump off the first-floor roof of his house,” Woods told me. “And he did. And he got up and said he was fine and died a day later because of hemorrhage. I realized that pain had a different meaning than I had thought. He didn’t have pain behavior to restrain him…“It is an extraordinary disorder,” Woods said. “Boys die at a younger age because of more risky behavior. It’s quite interesting, because it makes you realize pain is there for a number of reasons, and one of them is to use your body correctly without damaging it and modulating what you do.”

    …“Her life story offers an amazing snapshot of how complicated a life can get without the guidance of pain,” Staud said. “Pain is a gift, and she doesn’t have it.”

    Source: The New York Times
    • 3 years ago
  • Science and the Cost of Errors

    From Nassim Taleb on Scientific Discovery:

    What’s needed is an asymmetry: the errors need to be as painless as possible, compared to the payoffs of the successes. The mathematical equivalent of this property is called convexity; a nonlinear convex function is one with larger gains than losses. (If they’re equal, the function is linear). In research, this is what allows us to “harvest randomness”, as the article puts it.

    An example of such a process is biological evolution: most mutations are harmless and silent. Even the harmful ones will generally just kill off the one organism with the misfortune to bear them. But a successful mutation, one that enhances survival and reproduction, can spread widely. The payoff is much larger than the downside, and the mutations themselves come along for free, since some looseness is built into the replication process. It’s a perfect situation for blind tinkering to pay off: the winners take over, and the losers disappear.

    Taleb goes on to say that “optionality” is another key part of the process. We’re under no obligation to follow up on any particular experiment; we can pick the one that worked best and toss the rest. This has its own complications, since we have our own biases and errors of judgment to contend with, as opposed to the straightforward questions of evolution (“Did you survive? Did you breed?”). But overall, it’s an important advantage.

    The article then introduces the “convexity bias”, which is defined as the difference between a system with equal benefit and harm for trial and error (linear) and one where the upsides are higher (nonlinear). The greater the split between those two, the greater the convexity bias; and the more volatile the environment, the greater the bias as well. This is where Taleb introduces another term, “antifragile”, for phenomena that have this convexity bias, because they’re equipped to actually gain from disorder and volatility. (His background in financial options is apparent here.) What I think of at this point is Maxwell’s demon, extracting useful work from randomness by making decisions about which molecules to let through his gate. We scientists are, in this way of thinking, members of the same trade union as Maxwell’s busy creature, since we’re watching the chaos of experimental trials and natural phenomena and letting pass the results we find useful. (I think Taleb would enjoy that analogy.) The demon is, in fact, optionality manifested and running around on two tiny legs.
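
    The convexity bias is easy to exhibit numerically. With a convex payoff - losses capped, gains scaling (the payoff function below is invented purely for illustration) - the expected payoff over symmetric outcomes beats the payoff of the average outcome, and the gap widens with volatility:

```python
def payoff(x):
    """Convex payoff: losses capped at -1, gains scale tenfold (toy numbers)."""
    return max(x, -1.0) if x <= 0 else 10.0 * x

def expected_payoff(outcomes):
    return sum(payoff(x) for x in outcomes) / len(outcomes)

for spread in (0.5, 1.0, 2.0):       # larger spread = more volatile environment
    outcomes = (-spread, spread)     # symmetric around 0, so the mean outcome is 0
    bias = expected_payoff(outcomes) - payoff(0.0)
    print(f"spread={spread}: convexity bias = {bias:.2f}")
```

    The mean outcome is zero at every spread; only the volatility changes, and the bias grows with it - Jensen’s inequality doing the work Taleb describes.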

    Meanwhile, a more teleological (that is, aimed and coherent) approach is damaged under these same conditions. Uncertainty and randomness mess up the timelines and complicate the decision trees, and it just gets worse and worse as things go on. It is, by these terms, fragile.

    Taleb ends up with seven rules that he suggests can guide decision making under these conditions. I’ll add my own comments to these in the context of drug research.

    (1) Under some conditions, you’d do better to improve the payoff ratio than to try to increase your knowledge about what you’re looking for. One way to do that is to lower the cost-per-experiment, so that a relatively fixed payoff then is larger in comparison. The drug industry has realized this, naturally: our payoffs are (in most cases) somewhat out of our control, although the marketing department tries as hard as possible. But our costs per experiment range from “not cheap” to “potentially catastrophic” as you go from early research to Phase III. Everyone’s been trying to bring down the costs of later-stage R&D for just these reasons.

    (2) A corollary is that you’re better off with as many trials as possible. Research payoffs, as Taleb points out, are very nonlinear indeed, with occasional huge winners accounting for a disproportionate share of the pool. If we can’t predict these - and we can’t - we need to make our nets as wide as possible. This one, too, is appreciated in the drug business, but it’s a constant struggle on some scales. In the wide view, this is why the startup culture here in the US is so important, because it means that a wider variety of ideas are being tried out. And it’s also, in my view, why so much M&A activity has been harmful to the intellectual ecosystem of our business - different approaches have been swallowed up, and then they disappear as companies decide, internally, on the winners.

    And inside an individual company, portfolio management of this kind is appreciated, but there’s a limit to how many projects you can keep going. Spread yourself too thin, and nothing will really have a chance of working. Staying close to that line - enough projects to pick up something, but not so many as to starve them all - is a full-time job.

    (3) You need to keep your “optionality” as strong as possible over as long a time as possible - that is, you need to be able to hit a reset button and try something else. Taleb says that plans “…need to stay flexible with frequent ways out, and counter to intuition, be very short term, in order to properly capture the long term. Mathematically, five sequential one-year options are vastly more valuable than a single five-year option.” I might add, though, that they’re usually priced accordingly (and as Taleb himself well knows, looking for those moments when they’re not priced quite correctly is another full-time job).

    (4) This one is called “Nonnarrative Research”, which means the practice of investing with people who have a history of being able to do this sort of thing, regardless of their specific plans. And “this sort of thing” generally means a lot of that third recommendation above, being able to switch plans quickly and opportunistically. The history of many startup companies will show that their eventual success often didn’t bear as much relation to their initial business plan as you might think, which means that “sticking to a plan”, as a standalone virtue, is overrated.

    At any rate, the recommendation here is not to buy into the story just because it’s a good story. I might draw the connection here with target-based drug discovery, which is all about good stories.

    (5) Theory comes out of practice, rather than practice coming out of theory. Ex post facto histories, Taleb says, often work the story around to something that looks more sensible, but his claim is that in many fields, “tinkering” has led to more breakthroughs than attempts to lay down new theory. His reference is to this book, which I haven’t read, but is now on my list.

    (6) There’s no built-in payoff for complexity (or for making things complex). “In academia,” though, he says, “there is”. Don’t, in other words, be afraid of what look like simple technologies or innovations. They may, in fact, be valuable, but have been ignored because of this bias towards the trickier-looking stuff. What this reminds me of is what Philip Larkin said he learned by reading Thomas Hardy: never be afraid of the obvious.

    (7) Don’t be afraid of negative results, or paying for them. The whole idea of optionality is finding out what doesn’t work, and ideally finding that out in great big swaths, so we can narrow down to where the things that actually work might be hiding. Finding new ways to generate negative results quickly and more cheaply, which can mean new ways to recognize them earlier, is very valuable indeed.

    Taleb finishes off by saying that people have criticized such proposals as the equivalent of buying lottery tickets. But lottery tickets, he notes, are terribly overpriced, because people are willing to overpay for a shot at a big payoff on long odds. But lotteries have a fixed upper bound, whereas R&D’s upper bound is completely unknown. And Taleb gets back to his financial-crisis background by pointing out that the history of banking and finance points out the folly of betting against long shots (“What are the odds of this strategy suddenly going wrong?”), and that in this sense, research is a form of reverse banking.

    Well, those of you out there who’ve heard the talk I’ve been giving in various venues (and in slightly different versions) the last few months may recognize that point, because I have a slide that basically says that drug research is the inverse of Wall Street. In finance, you try to lay off risk, hedge against it, amortize it, and go for the steady payoff strategies that (nonetheless) once in a while blow up spectacularly and terribly. Whereas in drug research, risk is the entire point of our business (a fact that makes some of the business-trained people very uncomfortable). We fail most of the time, but once in a while have a spectacular result in a good direction. Wall Street goes short risk; we have to go long.
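    The asymmetry between those two payoff shapes can be sketched with a toy Monte Carlo simulation (the probabilities and payoffs here are my own illustrative numbers, not Taleb's). The "short risk" strategy wins small amounts almost always and rarely loses big; the "long risk" strategy loses small stakes almost always and rarely wins big. The interesting difference is where the tail sits:

```python
# Toy illustration (hypothetical numbers) of "going short risk" vs
# "going long risk". Short risk: frequent small gains, rare blowup.
# Long risk: frequent small losses, rare large win.
import random

random.seed(1)

def short_risk():
    # 99% chance of +1, 1% chance of -150 (the blowup is on the downside)
    return 1.0 if random.random() < 0.99 else -150.0

def long_risk():
    # 99% chance of -1, 1% chance of +250 (the extreme is on the upside)
    return -1.0 if random.random() < 0.99 else 250.0

n = 100_000
wall_street = [short_risk() for _ in range(n)]
research = [long_risk() for _ in range(n)]

# The long-risk strategy's worst single outcome is bounded and small...
assert min(research) == -1.0
# ...while the short-risk strategy's rare outcome is the large loss.
assert min(wall_street) == -150.0
```

    The shapes, not the averages, are the point: a drug-research-like strategy caps its per-bet loss and leaves the upside open, which is Taleb's case for why it shouldn't be priced like a lottery ticket.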

    Source: pipeline.corante.com
    • 3 years ago
    • #mechanicsofscience
  • Overfitting

    From Wikipedia: 

    In statistics and machine learning, overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. Overfitting generally occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. A model which has been overfit will generally have poor predictive performance, as it can exaggerate minor fluctuations in the data.

    The possibility of overfitting exists because the criterion used for training the model is not the same as the criterion used to judge the efficacy of a model. In particular, a model is typically trained by maximizing its performance on some set of training data. However, its efficacy is determined not by its performance on the training data but by its ability to perform well on unseen data. Overfitting occurs when a model begins to memorize training data rather than learning to generalize from the trend. As an extreme example, if the number of parameters is the same as or greater than the number of observations, a simple model or learning process can perfectly predict the training data simply by memorizing the training data in its entirety, but such a model will typically fail drastically when making predictions about new or unseen data, since the simple model has not learned to generalize at all.
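    The extreme case in that last sentence is easy to reproduce with numpy's polynomial fitting (a toy example of mine, not from the article): fit a polynomial with as many parameters as there are observations, and training error collapses to essentially zero while error on fresh data from the same process stays large.

```python
# Overfitting in miniature: 10 noisy samples of a line, fit by a
# 2-parameter model (degree 1) and a 10-parameter model (degree 9).
# The degree-9 fit interpolates the training points exactly, i.e.
# it memorizes them, but generalizes poorly to new samples.
import numpy as np

rng = np.random.default_rng(0)

def noisy_line(x, rng):
    return 2.0 * x + rng.normal(scale=0.5, size=x.shape)

x_train = np.linspace(0, 1, 10)
y_train = noisy_line(x_train, rng)

simple = np.polyfit(x_train, y_train, deg=1)    # 2 parameters
complex_ = np.polyfit(x_train, y_train, deg=9)  # 10 parameters = 10 points

x_test = np.linspace(0.02, 0.98, 50)
y_test = noisy_line(x_test, rng)

def mse(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

train_complex = mse(complex_, x_train, y_train)
test_complex = mse(complex_, x_test, y_test)

# Near-zero training error from memorization...
assert train_complex < mse(simple, x_train, y_train)
# ...but a large gap between training and unseen-data error.
assert test_complex > train_complex
```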

    See also:

    Neural Nets and Occam’s Razor

    Source: Wikipedia
    • 3 years ago
    • #statistics
  • Formalization of Thought

    Yes, any physical system could be subverted with a sufficiently unfavorable environment. You wouldn’t want to prove perfection. The thing you would want to prove would be more along the lines of, “will this system become at least somewhere around as capable of recovering from any disturbances, and of going on to achieve a good result, as it would be if its designers had thought specifically about what to do in case of each possible disturbance?”. (Ideally, this category of “designers” would also sort of bleed over in a principled way into the category of “moral constituency”, as in CEV.) Which, in turn, would require a proof of something along the lines of “the process is highly likely to make it to the point where it knows enough about its designers to be able to mostly duplicate their hypothetical reasoning about what it should do, without anything going terribly wrong”.

    We don’t know what an appropriate formalization of something like that would look like. But there is reason for considerable hope that such a formalization could be found, and that this formalization would be sufficiently simple that an implementation of it could be checked. This is because a few other aspects of decision-making which were previously mysterious, and which could only be discussed qualitatively, have had powerful and simple core mathematical descriptions discovered for cases where simplifying modeling assumptions perfectly apply. Shannon information was discovered for the informal notion of surprise (with the assumption of independent identically distributed symbols from a known distribution). Bayesian decision theory was discovered for the informal notion of rationality (with assumptions like perfect deliberation and side-effect-free cognition). And Solomonoff induction was discovered for the informal notion of Occam’s razor (with assumptions like a halting oracle and a taken-for-granted choice of universal machine). These simple conceptual cores can then be used to motivate and evaluate less-simple approximations for situations where the assumptions about the decision-maker don’t perfectly apply. For the AI safety problem, the informal notions (for which the mathematical core descriptions would need to be discovered) would be a bit more complex – like the “how to figure out what my designers would want to do in this case” idea above. Also, you’d have to formalize something like our informal notion of how to generate and evaluate approximations, because approximations are more complex than the ideals they approximate, and you wouldn’t want to need to directly verify the safety of any more approximations than you had to. (But note that, for reasons related to Rice’s theorem, you can’t (and therefore shouldn’t want to) lay down universally perfect rules for approximation in any finite system.)
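    The Shannon example the comment cites is concrete enough to write down (a minimal sketch, assuming i.i.d. symbols from a known distribution, as the comment's parenthetical notes): the informal notion of "surprise" becomes -log2(p) bits for an outcome of probability p, and entropy is its expected value.

```python
# Shannon's formalization of "surprise": an outcome with probability p
# carries -log2(p) bits of information; entropy is the expected
# surprisal over the whole distribution.
import math

def surprisal(p):
    """Information content, in bits, of an outcome with probability p."""
    return -math.log2(p)

def entropy(dist):
    """Expected surprisal of a distribution given as {symbol: probability}."""
    return sum(p * surprisal(p) for p in dist.values() if p > 0)

# A fair coin is maximally surprising per flip: exactly 1 bit.
assert entropy({"H": 0.5, "T": 0.5}) == 1.0
# A biased coin is more predictable, so it carries less information.
assert entropy({"H": 0.9, "T": 0.1}) < 1.0
```

    The qualitative notion came first, and the simple core came later; the comment's hope is that notions like "recovering as well as the designers would have intended" will turn out to have similarly compact cores.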

    Via LessWrong

    Source: lesswrong.com
    • 3 years ago
    • #mechanicsofreasoning
© 2012–2016 Coarse Grained