With technological advancements, cyber attacks have become highly automated and organized, giving attackers asymmetric advantages over defenders in cost and effort. Attackers employ sophisticated and diversified approaches to achieve their objectives while remaining stealthy. Therefore, enterprises strive for autonomous frameworks that optimize cybersecurity planning by addressing uncertainties related to attackers and the environment. However, most existing cybersecurity defense solutions are static and rely heavily on human expertise, which inevitably lowers the defenders' odds. This dissertation aims to advance the state of the art by developing new models and frameworks that enable automated and dynamic optimization of defense planning for cyber risk and attack mitigation. This dissertation has three objectives: (1) developing a framework to compute optimal cyber risk mitigation plans, (2) developing a framework to achieve real-time defense optimization against strategic cyber attacks in a stochastic environment, and (3) developing defense models to cope with dynamic attack and environment behavior.

In the second chapter, this dissertation presents formal models of an automated cyber risk mitigation framework, named CyberARM, that composes an optimal set of cybersecurity defense controls as a cybersecurity portfolio for an enterprise. Computing a cost-effective portfolio that optimizes Return on Investment (ROI) is still a highly complex and error-prone task due to the large number of security controls and the correlated risk factors (e.g., vulnerabilities and attack techniques) of an expanded and diversified attack surface. CyberARM formulates the decision-making problem as a Constraint Satisfaction Problem (CSP) to compute a correct-by-construction cybersecurity portfolio. The computed portfolio maximizes ROI against continually evolving threat actions while satisfying all user requirements (e.g., budget and mission-oriented constraints).
Moreover, the computed portfolio answers three fundamental questions: (1) ``what'' security controls are needed for ``which'' security function (i.e., Identify, Detect, Protect, Respond, and Recover), (2) ``where'' to enforce them (i.e., Network, Device, People, Application, and Data), and (3) ``why'' they are effective in the phases of the cyber attack kill chain. The evaluation results show that CyberARM can approximate a cost-effective cybersecurity portfolio for large enterprises by applying its model reduction and decomposition approaches.

In the third chapter, this dissertation presents a multi-agent distributed cyber defense framework, named Horde, to defend autonomously against sophisticated Infrastructural Distributed Denial of Service (I-DDoS) attacks. In I-DDoS attacks, attackers target core backhaul links to impede the availability of critical networks or servers while avoiding end-system defenses. Despite extensive efforts to develop DDoS mitigation solutions, the sophistication and potential impact of I-DDoS attacks continue to grow significantly. To protect critical network links, Horde assigns autonomous agents that dynamically compute a cost-effective composition of defense tactics (i.e., limiting, filtering, diversion, and rerouting) in real time, considering the expected behavior of I-DDoS attackers and the network. It establishes automated collaboration among agents to share spare bandwidth for rerouting the prioritized traffic of congested links through alternative routes. Horde formulates each agent's decision-making problem using Reinforcement Learning (RL) and applies a Partially Observable Markov Decision Process (POMDP) to solve it, after reasoning over the uncertainties in decision parameters.

In the fourth chapter, this dissertation presents models that infer the expected behavior of attackers and the environment and integrate it into decision-making, in order to confront dynamic I-DDoS attackers in an uncertain environment.
Most existing game-theoretic DDoS frameworks struggle because they make static assumptions about attackers and critical environmental parameters. This chapter presents incremental and online approaches that learn the attack strategies currently in use and the critical decision parameters of the network without requiring deep domain knowledge. The autonomous defense model enables Horde agents to evolve dynamically by observing, hypothesizing, investigating, and deciding based on their interaction experience with the environment and attackers. This enables the agents not only to observe and respond to I-DDoS attacks in a timely and effective manner but also to exhibit robust behavior against evasion and deception attacks. The evaluation results on diversified attack strategies show that Horde agents can serve more than 97\% of benign traffic despite dynamic attack and network behavior and inaccuracies in attack detection.