CS 188 | Spring 2021 Project 2: Multi-Agent Search

Experiment 2: Pacman (Adversarial Search)

1. Project description

Title page: (screenshot omitted)

Project code skeleton: (screenshot omitted)

         In this project, you will design agents for the classic version of Pacman, including the ghosts. Along the way, you will implement both minimax and expectimax search and try your hand at evaluation function design.

         Only five questions need to be completed to finish the assignment. Follow the steps described in the project; the work mainly consists of filling in code in the multiAgents.py file.

Experiment files

Files to edit:
    multiAgents.py: All of the search agents will live in this file.

Files to review:
    pacman.py: The main file that runs the Pacman game. It contains the GameState class, which is used extensively throughout the project.
    game.py: Implements the running logic of the Pacman game and contains several supporting classes such as AgentState, Agent, Directions, and Grid.
    util.py: Contains the data structures needed to implement the search algorithms.

Supporting files that can be ignored:
    ghostAgents.py: Controls the ghost agents.
    graphicsDisplay.py: Implements the graphical interface of the Pacman game.
    graphicsUtils.py: Supports the graphical interface of the Pacman game.
    keyboardAgents.py: Implements keyboard control of Pacman.
    layout.py: Contains the code for reading game layout files and storing their contents.
    textDisplay.py: Provides ASCII graphics for the Pacman game.
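
For orientation, here is a minimal sketch (not part of the assignment) showing how the GameState methods used throughout this write-up are called. InspectingAgent is an invented name, and the snippet assumes the project files are importable:

from game import Agent

class InspectingAgent(Agent):
    """Hypothetical agent that only inspects the state, then stands still."""

    def getAction(self, gameState):
        numAgents = gameState.getNumAgents()        # Pacman plus all ghosts
        pacmanMoves = gameState.getLegalActions(0)  # agent 0 is always Pacman
        # Child state after Pacman takes its first legal action
        successor = gameState.getNextState(0, pacmanMoves[0])
        print(numAgents, pacmanMoves, successor.isWin(), successor.isLose())
        return 'Stop'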

2. Experimental code

         The code for Question 1 is omitted here. Detailed comments are provided mainly for Questions 2 and 3; I hope they are helpful.

Question 2: Minimax

class MinimaxAgent(MultiAgentSearchAgent):
    """
    Your minimax agent (question 2)
    """

    def getAction(self, gameState):
        """
        Returns the minimax action from the current gameState using self.depth
        and self.evaluationFunction.

        Here are some method calls that might be useful when implementing minimax.

        gameState.getLegalActions(agentIndex):
        Returns a list of legal actions for an agent
        agentIndex=0 means Pacman, ghosts are >= 1

        gameState.getNextState(agentIndex, action):
        Returns the child game state after an agent takes an action

        gameState.getNumAgents():
        Returns the total number of agents in the game

        gameState.isWin():
        Returns whether or not the game state is a winning state

        gameState.isLose():
        Returns whether or not the game state is a losing state
        """

        "*** YOUR CODE HERE ***"
        # List of ghost agent indices (Pacman is agent 0; ghosts are numbered from 1)
        GhostIndex = [i for i in range(1, gameState.getNumAgents())]
        # Termination test: recursion stops when the game is won or lost, or when
        # the depth limit is reached; the state is then scored by the evaluation function
        def term(state, d):
            return state.isWin() or state.isLose() or d == self.depth

        # Minimizer: computes the value of the given ghost's turn
        def min_value(state, d, ghost):  # minimizer
            # At a terminal state (or at the depth limit), score the state directly
            if term(state, d):
                return self.evaluationFunction(state)

            "Value for Min node. May have multiple ghosts"
            # A min node's value starts at positive infinity
            v = float("inf")
            # Recursively evaluate each successor state
            for action in state.getLegalActions(ghost):
                # After the last ghost has moved, the turn passes back to Pacman at the next depth
                if ghost == GhostIndex[-1]:
                    v = min(v, max_value(state.getNextState(ghost, action), d + 1))
                else:
                    v = min(v, min_value(state.getNextState(ghost, action), d, ghost + 1))
            # Return the minimum value found
            return v

        # Maximizer: the symmetric case for Pacman (agent 0)
        def max_value(state, d):  # maximizer
            # At a terminal state (or at the depth limit), score the state directly
            if term(state, d):
                return self.evaluationFunction(state)

            "Value for Max node"
            # A max node's value starts at negative infinity
            v = float("-inf")
            for action in state.getLegalActions(0):
                # There is only one Pacman, so the next agent is always ghost 1
                # Note: the depth counter d is advanced in min_value, so it is not
                # incremented here. Incrementing it in both places would double-count
                # plies, which is why some solutions found online compare against
                # 2 * self.depth in term() instead.
                v = max(v, min_value(state.getNextState(0, action), d, 1))
            # print(v)
            return v

        "Select action for Max node"
        # Top-level call: although Pacman is the maximizer, min_value is called here
        # because this loop itself acts as the root max node. getAction() must return
        # an action rather than a value, so each action is paired with its minimax value.
        # Sorting the (action, value) pairs by value leaves the best action last: res[-1][0].
        res = [(action, min_value(gameState.getNextState(0, action), 0, 1)) for action in
               gameState.getLegalActions(0)]
        res.sort(key=lambda k: k[1])

        return res[-1][0]
    # util.raiseNotDefined()
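
Besides the autograder, the agent can be watched in an actual game. The invocation below follows pacman.py's standard options as described in the project (adjust the layout and depth as needed); depth here becomes the self.depth used in term():

python pacman.py -p MinimaxAgent -l minimaxClassic -a depth=4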

Question 3: Alpha-Beta Pruning

class AlphaBetaAgent(MultiAgentSearchAgent):
    """
    Your minimax agent with alpha-beta pruning (question 3)
    """

    def getAction(self, gameState):
        """
        Returns the minimax action using self.depth and self.evaluationFunction
        """
        "*** YOUR CODE HERE ***"
        # With pruning added, the overall structure of the minimax code is unchanged;
        # threading alpha (A) and beta (B) through the recursion is what improves efficiency.
        # Because min and max layers alternate, A and B are updated on alternate layers:
        # pruning is the upper layer's constraint on the lower layer, and that constraint
        # comes from branches that have already been explored.
        GhostIndex = [i for i in range(1, gameState.getNumAgents())]

        def term(state, d):
            return state.isWin() or state.isLose() or d == self.depth

        def min_value(state, d, ghost, A, B):  # minimizer

            if term(state, d):
                return self.evaluationFunction(state)

            v = float("inf")
            # Each action below is a move of the current ghost
            for action in state.getLegalActions(ghost):
                if ghost == GhostIndex[-1]:  # next is maximizer with pacman
                    v = min(v, max_value(state.getNextState(ghost, action), d + 1, A, B))
                else:  # next is minimizer with next-ghost
                    v = min(v, min_value(state.getNextState(ghost, action), d, ghost + 1, A, B))

                # The key difference from plain minimax: v only decreases as successors
                # are explored, so once v < A the maximizer above will never pick this
                # branch; the remaining actions cannot raise v, so prune immediately.
                if v < A:
                    return v
                # Tighten the bound: B tracks the smallest value seen so far on this
                # path, giving deeper nodes a stronger constraint to prune against.
                B = min(B, v)

            return v

        def max_value(state, d, A, B):  # maximizer

            if term(state, d):
                return self.evaluationFunction(state)

            v = float("-inf")
            for action in state.getLegalActions(0):
                v = max(v, min_value(state.getNextState(0, action), d, 1, A, B))

                # The symmetric case: once v > B, the minimizer above will never
                # allow this branch, so prune.
                if v > B:
                    return v
                A = max(A, v)

            return v

        def alphabeta(state):

            v = float("-inf")
            act = None
            A = float("-inf")
            B = float("inf")

            for action in state.getLegalActions(0):  # maximizing
                tmp = min_value(state.getNextState(0, action), 0, 1, A, B)

                # Track the best value seen so far and remember its action
                if v < tmp:  # same as v = max(v, tmp)
                    v = tmp
                    act = action

                # At the root B stays +inf, so this prune never fires; kept for symmetry
                if v > B:  # pruning
                    return v
                A = max(A, tmp)

            return act

        return alphabeta(gameState)
        # util.raiseNotDefined()
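
Note that the strict comparisons (v < A and v > B, rather than <= and >=) matter: the project's autograder checks the set of explored states and expects no pruning on equality. As a standalone sanity check (independent of the Pacman code, with hypothetical helper names), the classic textbook tree below gets the same root value from both searches while alpha-beta visits fewer leaves:

def minimax_count(node, is_max, counter):
    # Leaves are numbers; internal nodes are lists of children.
    if isinstance(node, (int, float)):
        counter[0] += 1
        return node
    pick = max if is_max else min
    return pick(minimax_count(c, not is_max, counter) for c in node)

def alphabeta_count(node, is_max, A, B, counter):
    if isinstance(node, (int, float)):
        counter[0] += 1
        return node
    v = float("-inf") if is_max else float("inf")
    for child in node:
        w = alphabeta_count(child, not is_max, A, B, counter)
        if is_max:
            v = max(v, w)
            if v > B:           # same strict test as in max_value above
                return v
            A = max(A, v)
        else:
            v = min(v, w)
            if v < A:           # same strict test as in min_value above
                return v
            B = min(B, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # max root over three min nodes
c1, c2 = [0], [0]
print(minimax_count(tree, True, c1))                                 # 3
print(alphabeta_count(tree, True, float("-inf"), float("inf"), c2))  # 3
print(c1[0], c2[0])  # 9 leaves vs. 7 leaves: pruning skipped two evaluations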

Question 4: Expectimax

class ExpectimaxAgent(MultiAgentSearchAgent):
    """
      Your expectimax agent (question 4)
    """

    def getAction(self, gameState):
        """
        Returns the expectimax action using self.depth and self.evaluationFunction

        All ghosts should be modeled as choosing uniformly at random from their
        legal moves.
        """
        "*** YOUR CODE HERE ***"
        GhostIndex = [i for i in range(1, gameState.getNumAgents())]

        def term(state, d):
            return state.isWin() or state.isLose() or d == self.depth

        def exp_value(state, d, ghost):  # expectation (chance) node

            if term(state, d):
                return self.evaluationFunction(state)

            v = 0
            # Each ghost chooses uniformly at random, so every legal action has
            # probability 1 / count, where count is the number of legal moves.
            # term() has already handled terminal states, so the list is non-empty here.
            prob = 1 / len(state.getLegalActions(ghost))
            for action in state.getLegalActions(ghost):
                if ghost == GhostIndex[-1]:
                    v += prob * max_value(state.getNextState(ghost, action), d + 1)
                else:
                    v += prob * exp_value(state.getNextState(ghost, action), d, ghost + 1)
            # print(v)
            return v

        def max_value(state, d):  # maximizer

            if term(state, d):
                return self.evaluationFunction(state)

            v = float("-inf")
            for action in state.getLegalActions(0):
                v = max(v, exp_value(state.getNextState(0, action), d, 1))
            # print(v)
            return v

        res = [(action, exp_value(gameState.getNextState(0, action), 0, 1)) for action in
               gameState.getLegalActions(0)]
        res.sort(key=lambda k: k[1])

        return res[-1][0]
        # util.raiseNotDefined()
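
The only structural change from minimax is that ghost nodes average over successors instead of minimizing, which changes Pacman's attitude toward risk. A tiny worked example (the numbers are made up for illustration):

# One ghost node with two equally likely replies.
outcomes = [530, -502]                   # e.g. "win soon" vs. "get eaten"
minimax_value = min(outcomes)            # -502: a minimizer avoids this branch entirely
expectimax_value = sum(outcomes) / len(outcomes)   # 14.0: on average, worth the gamble
print(minimax_value, expectimax_value)

The project's trappedClassic layout demonstrates exactly this difference: an alpha-beta agent assumes the worst and always loses there, while the expectimax agent gambles and wins roughly half of its games.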

Question 5: Evaluation Function

def betterEvaluationFunction(currentGameState):
    """
    Your extreme ghost-hunting, pellet-nabbing, food-gobbling, unstoppable
    evaluation function (question 5).

    DESCRIPTION: <write something here so we know what you did>
    """
    "*** YOUR CODE HERE ***"
    newPos = currentGameState.getPacmanPosition()
    newFood = currentGameState.getFood().asList()
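    # Ghost states and scared timers are extracted below but not used in this
    # simple version; see the sketch after this function for one way to use them.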
    newGhostStates = currentGameState.getGhostStates()
    newScaredTimes = [ghostState.scaredTimer for ghostState in newGhostStates]

    # Evaluation: the current game score plus the reciprocal of the distance to
    # the nearest food pellet. If no food remains, foodDist stays at infinity and
    # the bonus 1.0 / foodDist evaluates to 0.0, so the expression is safe.
    score = currentGameState.getScore()
    foodDist = float("inf")
    for food in newFood:
        foodDist = min(foodDist, util.manhattanDistance(food, newPos))
    score += 1.0 / foodDist

    return score
    # util.raiseNotDefined()
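
The ghost states and scared timers extracted above go unused in the simple version. Below is a sketch of one way to fold them in; the function name is hypothetical and the weights are untuned guesses, not values from the assignment:

def betterEvaluationFunctionWithGhosts(currentGameState):
    """Hypothetical variant that also reacts to ghosts. Weights are untuned guesses."""
    pos = currentGameState.getPacmanPosition()
    foodList = currentGameState.getFood().asList()
    ghostStates = currentGameState.getGhostStates()

    score = currentGameState.getScore()

    # Same food term as above: prefer states closer to the nearest pellet.
    foodDist = min([util.manhattanDistance(f, pos) for f in foodList],
                   default=float("inf"))
    score += 1.0 / foodDist

    for ghost in ghostStates:
        d = util.manhattanDistance(ghost.getPosition(), pos)
        if ghost.scaredTimer > d:
            score += 50.0 / (d + 1)   # scared and reachable: reward chasing it
        elif d < 2:
            score -= 100.0            # active ghost adjacent: strong penalty
    return score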

3. Test screenshots

         The test framework (autograder.py) is provided with the project, so each question can be checked directly:

Minimax Testing:

Test case:

python autograder.py -q q2

Screenshot: (omitted)

Alpha-Beta Pruning Testing:

Test case:

python autograder.py -q q3 --no-graphics

Screenshot: (omitted)

Expectimax Testing:

Test case:

python autograder.py -q q4 --no-graphics

Screenshot: (omitted)

Evaluation Function Testing:

Test case:

python autograder.py -q q5 --no-graphics

Screenshot: (omitted)

Tags: Python Pycharm

Posted on Thu, 16 Sep 2021 17:05:46 -0400 by andrests