We develop a theoretical model capable of explaining the existence of sustainable common pool resource equilibria in the absence of external regulation. We combine ideas from the literature on social norms in an iterative game theory framework to establish the existence of multiple sustainable common pool equilibria.

Summary: Consider a highly stylized common pool resource (CPR) allocation problem in which individual agents can comply with or defect from some benchmark behavior. For example, a recreational boater can choose to dispose of trash overboard or haul it to shore, or a polluting firm can choose to meet or exceed a governmental standard for emissions. Traditional neo-classical models of common pool resource allocation predict that rational, self-interested agents will tend to over-exploit the CPR in the absence of external regulation. A number of recent studies have documented common pool resource outcomes that contradict this commonly held belief. In recent years, social norms have gained much attention from economists as an important driving force of individual behavior. A number of studies on the management of CPRs through endogenous institutions have argued for the importance of social norms in maintaining efficient, sustainable allocations in the absence of external regulation (Ostrom (1992), Bromley (1992), Baland and Platteau (1996)). Despite the empirical documentation of potentially sustainable outcomes, theoretical explanations remain in their infancy. Since Hardin's (1968) seminal work on the 'tragedy of the commons', a number of models based on neo-classical economics have tried to explain the existence of common pool resource equilibria consisting of partial compliance or defection (Sethi and Somanathan (1996), Haab and McConnell (2002)).
Of particular interest are evolutionary models of compliance, which incorporate the behavioral outcomes of others into individual decision making and have proven popular in explaining the collective behavior problems associated with sustainable CPR outcomes. However, these evolutionary models fail to incorporate two commonly observed characteristics of common pool resource decision environments: partial compliance equilibria and costly sanctioning behavior. Haab and McConnell (2002) develop a rudimentary evolutionary model of compliance behavior and show that heterogeneous distributions of compliance costs across a population can result in a less than full compliance equilibrium. Their model, however, ignores the possibility of endogenous sanctioning of deviant behavior. Other evolutionary models of common pool behavior assume either altruistic motives for sanctioning or costless sanctioning. The purpose of this paper is to provide a theoretical explanation for partial compliance equilibria in a common pool resource allocation problem with and without endogenous sanctioning behavior. First, we present the theoretical possibility of a partial compliance equilibrium with exogenous sanctioning and fully observable agent behavior. Next, we introduce uncertainty into the model through the incomplete observation of agent behavior to examine why some agents willingly undertake costly sanctioning. We show that even in the presence of costly, endogenous sanctioning, multiple CPR equilibria exist, including a stable partial compliance equilibrium. Finally, we compare the resulting CPR equilibria to determine the socially efficient outcome, and compare the costs of various policies for achieving such outcomes.

An Overview of the Model and Outcomes: Reconsider the stylized CPR problem in which individual agents choose whether or not to undertake some behavior (a binary decision). Assume there are two types of agents: defectors and compliers.
Further, compliers can be divided into two subgroups: sheer compliers and enforcers. Defectors do not follow the socially accepted norm and can be punished (sanctioned) by compliers when detected. Sheer compliers follow the social norm but pay a compliance cost (e.g. the time value of waste disposal). Enforcers, a subset of compliers, willingly sanction defectors but pay a sanctioning cost (e.g. a monitoring cost). The primary difference between the current model and other evolutionary models of behavior lies in the payoff function of the agents. We assume that an agent incurs an internal cost of defection (e.g. humiliation) whenever he defects, whether or not he is detected. Several studies (Coleman (1987), Kerr et al. (1994), Crawford and Ostrom (1995) and Haab and McConnell (2001)) have already examined this internal cost. Further, the agent can earn a benefit through enhanced reputation by complying (Hamilton (1964), Axelrod (1984), Nowak and Sigmund (1998) and Leimar and Hammerstein (2001)). The internal cost increases as the number of compliers increases. Reputation can be earned only when compliance is observed by others; thus the size of the CPR community and the proportion of compliers are important. The benefits of defection increase as the number of compliers increases. For example, in a fishery, the more agents comply with the efficient fishing effort, the more fish defectors can catch. However, the internal cost of defection also increases as the number of compliers increases. The difference between the benefit of defection and its cost (the internal cost plus the penalty inflicted through sanctioning), that is, the net benefit of defection, increases as the number of compliers increases. On the other hand, the marginal reputation benefit is increasing at low levels of compliance, but beyond some level of compliance reputation increases at a decreasing rate.
A representative individual will compare the marginal net benefit of defection to the marginal reputation benefit of compliance and will choose to defect if the former is greater. A partial compliance equilibrium is sustainable for the following reason. For compliance populations above the point where the marginal net benefit of defection equals the marginal reputation benefit of compliance, the marginal net benefit of defection is greater, leading to increasing defection. For compliance populations below that point, the marginal net benefit of defection is less than the marginal reputation benefit of compliance, leading to increasing compliance. With sufficient exogenous sanctioning by a government or other authorized party, the marginal net benefit of defection can be reduced, thereby achieving a less than full compliance (or full defection) equilibrium. In practice, sanctioning may instead be carried out by members of the community themselves. The interesting features of this endogenous sanctioning are that it is not only sufficient to maintain a sustainable level of the CPR but is also carried out by rational individuals who would not sanction unless there were a reward. When individual behavior can be observed fully or partially, sanctioning is rewarded in the form of reputation or prestige. Thus there may be sufficient incentive to overcome the cost of sanctioning. In environments where information about individual behavior is uncertain, people have an incentive to monitor others' strategies in order not to be fooled in the next stage of the game. It follows that there is a positive relationship between costly sanctioning and either earning reputation or acquiring information about others' behavior; which incentive operates depends on how much information about others' behavior is available.
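The stability argument for the partial compliance equilibrium can be illustrated numerically. The following is only a minimal sketch under assumptions that do not come from the model itself: the linear form of the net benefit of defection, the logistic (S-shaped) reputation benefit, all parameter values, and the replicator-style adjustment rule are hypothetical choices made to match the qualitative shapes described above. Here `x` denotes the share of compliers in the population.

```python
import math

# All functional forms and numbers below are hypothetical illustrations,
# not the payoff functions of the model described in the text.

def net_defection_benefit(x):
    """Benefit of defection minus internal and sanction costs; rises with the complier share x."""
    return 0.15 + 0.9 * x

def reputation_benefit(x):
    """S-shaped reputation payoff: rises at an increasing rate for low x, then at a decreasing rate."""
    return 1.0 / (1.0 + math.exp(-10.0 * (x - 0.4)))

def compliance_advantage(x):
    """Positive when complying pays more than defecting at complier share x."""
    return reputation_benefit(x) - net_defection_benefit(x)

def interior_equilibria(n_grid=1000, tol=1e-10):
    """Locate interior shares where the two payoffs are equal (grid scan plus bisection)."""
    roots = []
    for i in range(1, n_grid - 1):
        lo, hi = i / n_grid, (i + 1) / n_grid
        if compliance_advantage(lo) * compliance_advantage(hi) < 0:
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if compliance_advantage(lo) * compliance_advantage(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

def is_stable(x_star, eps=1e-4):
    """Stable if compliance pays just below x_star and defection pays just above it."""
    return compliance_advantage(x_star - eps) > 0 > compliance_advantage(x_star + eps)

def simulate(x0, dt=0.1, steps=5000):
    """Assumed replicator-style adjustment: the complier share grows when compliance pays more."""
    x = x0
    for _ in range(steps):
        x += dt * x * (1.0 - x) * compliance_advantage(x)
        x = min(1.0, max(0.0, x))  # keep the share inside [0, 1]
    return x

if __name__ == "__main__":
    for x_star in interior_equilibria():
        print(round(x_star, 3), "stable" if is_stable(x_star) else "unstable")
    for x0 in (0.3, 0.6, 0.99):
        print(x0, "->", round(simulate(x0), 3))
```

Under these assumed forms the two payoff curves cross twice in the interior. The lower crossing is unstable and acts as a threshold: populations starting below it drift toward full defection. The upper crossing is a stable partial compliance share, approached both from moderately high compliance and from near full compliance. Lowering the intercept of `net_defection_benefit` is one simple way to mimic stronger exogenous sanctioning in this sketch.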
In conclusion, given a sufficient internal cost of defection and a sufficient reputation benefit from compliance, a less than full compliance equilibrium with exogenous sanctioning can be explained. Even with endogenous sanctioning, a stable partial compliance equilibrium is possible if people have an incentive either to gather information on others' behavior (in the hidden action case) or to earn reputation (in the partially observable action case).